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ABSTRACT 

This  report  presents  an  analysis  of  a  generalised  logistics  supply  chain  without  back  orders. 
Two  methods  are  proposed:  a  responsive  but  robust  delivery  system  based  on  maintaining  set 
holdings  levels;  and  a  technique  from  Control  Theory  which  pushes  stock  through  the  chain 
in  anticipation  of  demand  over  both  time  and  space.  Furthermore,  a  heuristic  is  proposed  to 
set  the  policy  for  holdings  levels  using  a  hybrid  of  statistical  analysis.  Simulated  Annealing 
and  Lagrangian  Relaxation.  Finally,  a  comparison  between  methods  under  uncertainty,  with 
error  in  prediction  and  correlated  demand,  is  conducted.  Each  method  was  found  to  be  useful 
in  different  contexts.  The  impact  of  uncertainty  and  correlated  consumption  was  quantified 
for  a  set  scenario  and  both  were  found  to  be  significant  factors  in  the  performance  of  the 
supply  chain. 
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A  Study  of  Logistics  Chains  for  Optimal  Supply  and 

Maximum  Throughput 

Executive  Summary 

Conceptual  incentives  in  logistics  management  and  control,  such  as  the  United  State's 
Office  of  Force  Transformation  Sense  and  Respond  Logistics  and  the  Australian 
Defence  Force's  Future  Joint  Logistics  Concept,  promote  investment  in  rapid 
distribution  supply  networks  based  on  adaptive  processes  across  the  Joint  domain.  It  is 
proposed  that  streamlined  "faster"  and  "leaner"  methods  of  resupply,  which  trade  off 
capacity  and  volume  against  rate  of  resupply,  have  the  potential  to  reduce  logistics 
footprints,  improve  distribution  times  and  lower  the  risk  of  shortfalls  in  supply. 

This  work  primarily  supports  the  defence  capability  Land  121  -  Field  Vehicle  and 
Trailer  Fleet  Modernisation  project  with  additional  input  to  the  JP  126  -  Joint  Theatre 
Distribution  System,  JP  2059  -  Bulk  Liquid  Distribution  and  JP  2077  -  Improved 
Logistics  Information  System  projects.  It  examines  the  practicality  of  the  new  just-in- 
time  precision  logistics  concepts  by  contrasting  a  purely  responsive  demand-orientated 
model  against  a  deliberately  planned  supply-orientated  model. 

It  is  determined  that  no  single  method  of  logistical  supply  is  a  panacea  for  designing, 
managing  and  controlling  logistics  systems.  The  relevance  and  appropriateness  of  each 
are  tied  to  the  predictability  of  the  logistics  system  and  its  behaviour.  Identifying  when 
each  approach  is  useful  is  then  more  important  than  promoting  any  one  approach 
above  others.  Furthermore,  techniques  for  responsive  systems,  planned  systems  and 
adaptive  systems  are  not  necessarily  mutually  exclusive  and  can  be  applied  in 
sequence  or  in  concurrence  for  some  systems.  The  benefits  of  one  particular  concept  or 
philosophy  over  another  becomes  a  moot  point  as  each  is  useful  in  context. 

Correlations  between  consumption  at  different  nodes  in  a  supply  chain  were  found  to 
significantly  reduce  the  performance  of  the  system.  This  highlights  the  need  to 
incorporate  a  level  of  buffering  in  warehousing  policies  to  counter  this  effect. 
Uncertainty  in  predicting  future  consumption,  and  different  probability  density 
functions  for  consumption  schedules  were  also  found  to  significantly  affect  the 
performance  of  the  system  for  the  scenario  under  investigation.  Together,  these  factors 
point  to  the  need  to  maintain  local  capacity  to  achieve  robust  performance  in  the  face  of 
unexpected  events. 

The  results  of  this  study  do  not  directly  contradict  the  ideas  presented  in  the  Australian 
Defence  Force's  Future  Joint  Logistics  Concept,  but  instead  serve  as  a  warning  that 
responsive  systems,  simple  or  complex,  will  not  and  cannot  entirely  replace  traditional 
planned  systems,  and  that  the  problems  associated  with  the  design,  implementation 
and  analysis  of  adaptive  precision  systems  are  non-trivial.  This  study  does  support  the 
conclusion  that  enhanced  capabilities,  emerging  technologies  and  network-enabled 
warfare  will  ultimately  facilitate  better  management  and  control  of  logistics  systems. 
However,  it  does  so  on  the  basis  that  these  new  developments  will  reduce  uncertainty 
across  the  system  and  therein  have  an  indirect  effect  on  the  system  rather  than  directly 
supporting  better  logistics  supply. 
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1  Introduction 


The  nature  of  Land  warfare  is  becoming  increasingly  complex,  diverse,  diffuse  and  lethal 
(Commonwealth  of  Australia,  2004).  However,  complex  terrain  and  the  changing  con¬ 
temporary  conflict  environment  offers  new  opportunities  as  well  as  new  threats  for  the 
Australian  Defence  Force  to  overcome.  Facilitated  by  enhanced  capabilities,  emerging 
technologies  and  network-enabled  warfare,  the  Future  Joint  Logistics  Concept  (Common¬ 
wealth  of  Australia,  2002)  provides  a  conceptual  foundation  for  the  evolution  of  logistics 
management  and  control  to  meet  the  challenge  of  future  warfighting. 

The  Australian  Defence  Force’s  Future  Joint  Logistics  Concept  promotes  an  agile  logistics 
network  which  is  focused  on  rapid  distribution  of  goods  and  services.  This  shift  from 
static  inventory  levels  and  stockpiling  policies  in  favour  of  faster  and  leaner  methods  of 
resupply  will  not  necessarily  eliminate  or  reduce  the  requirement  for  contingency  stock¬ 
piling  policies  but  has  the  potential  to  tradeoff  capacity  and  volume  for  increased  rate  of 
resupply.  It  is  proposed  that  adaptive  and  anticipatory  logistics  networks  streamlined  for 
rapid  distribution  have  the  potential  to  reduce  logistics  footprints,  improve  distribution 
times  and  lower  the  risk  of  shortfalls  in  supply. 

In  both  the  civilian  and  military  sectors,  the  justification  for  investing  in  Information  Tech¬ 
nology  improvements  for  logistics  supply  chains  has  traditionally  been  in  terms  of  greater 
efficiency.  Potential  improvements  include  “better  information  and  prediction  about  con¬ 
sumption  rates,  faster  transport,  cheaper  transport,  more  efficient  inter-modal  transfer, 
reduced  stockpiling,  quicker  delivery  or  shuttle  platforms,  better  information  and  predic¬ 
tion  about  transfer  rates,  better  visibility  into  critical  items  and  better  understanding 
of  item  sequencing”  (CAPT  Lewandowski,  L.  and  J.R.  Cares,  2005).  From  this  perspec¬ 
tive,  information  technology  plays  an  important  role  in  process  improvements  by  gathering 
more  information  on  consumption  levels  to  enable  better  prediction  of  future  consumption. 
Clearly,  the  efficiency  gains  are  dependent  on  the  ability  to  predict  future  consumption 
with  sufficient  accuracy  and  lead  time  to  evaluate  and  implement  the  optimised  supply 
schedule. 

Sense  and  Respond  (SR)  logistics  is  an  aspirational  goal  for  future  military  logistics  systems 
promoted  by  the  US  Office  of  Force  Transformation  (United  States  Department  of  Defense, 
2004)  and  is  also  reflected  in  the  Australian  Future  Joint  Logistics  Concept’s  desire  for 
Network  Enabled  Logistics  (Commonwealth  of  Australia,  2002).  SR  logistics  “is  envisioned 
as  an  approach  that  yields  adaptive,  responsive  demand  and  support  for  force  capability 
sustainment  that  operates  in  situation-conditioned  structures  that  recognize  operational 
context,  coherence,  and  coordination”  (United  States  Department  of  Defense,  2005).  Thus, 
the  role  of  information  technology  is  shifted  from  purely  facilitating  global  optimisation, 
to  enabling  robust  responses  to  local  conditions  throughout  the  logistics  network. 

This  study  examines  the  practicality  of  the  new  just-in-time  precision  logistics  concepts 
by  developing  and  subsequently  analysing  two  logistics  systems,  each  modelled  on  the 
opposing  but  not  necessarily  contradictory  viewpoints  introduced  above. 


In  Section  2,  a  responsive,  demand-orientated  model  is  formulated.  This  model  reacts 
to  its  environment  with  fixed  responsive  dynamics  without  prediction  of  future  demand 
or  adaptation  and  learning.  This  simple  model  captures  the  local  responsiveness  and 


1 


DSTO-TR  1807 


robustness  of  the  SR  concept,  although  the  role  of  prediction  and  adaptation  are  not 
modelled.  Using  this  model,  we  investigate  the  nature  of  stockholding  policies  in  linear 
supply  chains  under  our  interpretation  of  the  SR  policy  for  resupply.  The  ability  to  use 
warehouses  to  adequately  manage  the  risk  of  stockouts  and  shortfalls  in  supply  throughout 
the  system  is  investigated  -  that  is,  we  attempt  to  minimise  the  probability  of  ever  running 
out  of  stock  and  failing  to  meet  demand.  Under  SR  dynamics,  the  resupply  policy  is  fixed. 
We  then  develop  a  method  to  determine  stockholding  policies  for  warehouses  throughout 
the  chain  subject  to  known  constraints  using  a  hybrid  technique  combining  aspects  of 
statistics,  Simulated  Annealing  (Kirkpatrick  et  ah,  1983;  Cerny,  1985)  and  Lagrangian 
Relaxation  (Everett,  1963;  Fisher,  1985).  Hence,  we  investigate  the  premise  that  a  logistics 
network  can  be  streamlined,  maintaining  low  holdings  and  capacity,  while  still  maintaining 
adequate  distribution  and  throughput. 

In  Section  3,  a  deliberately  planned  supply-orientated  model  is  formulated.  This  model 
is  fundamentally  different  to  the  SR  model  of  Section  2.  It  reacts  to  its  environment 
using  predictive  dynamics  under  the  assumption  of  perfect  foresight  with  compensation 
for  error  and  uncertainty.  In  this  model,  we  fix  the  holdings  policies  throughout  the 
system  and  develop  a  method  to  determine  an  optimal  routing  or  schedule  for  goods 
using  a  technique  from  Control  Theory  called  Dynamic  Programming  (DP)  (Bellman, 
1952).  The  DP  model  instantiates  the  optimisation  concept  for  our  linear  supply  chain. 
The  problem  is  highly  abstracted,  since  the  motivation  is  to  understand  the  dynamics 
of  efficient  supply  chains  rather  than  build  an  accurate  model  of  contemporary  military 
logistics  systems.  Nevertheless  the  DP  model  provides  us  with  a  sufficient  representation 
of  the  defining  characteristics  of  the  optimised  logistics  supply  chain  concept  to  quantify 
some  aspects  of  the  performance  of  these  systems. 

Finally,  we  relax  the  assumption  of  perfect  foresight  and  introduce  stochastic  noise  into 
predictions  of  future  consumption  as  well  as  correlating  consumption  to  introduce  occa¬ 
sional  catastrophic  failures  in  the  chains.  This  allows  us  to  explore  some  of  the  problems 
associated  with  the  practical  application  of  DP  and  to  compare  it  against  the  fixed  SR 
policy  introduced  in  Section  4.1. 

A  discussion  of  the  results  and  further  research  is  provided  in  Section  5.  Conclusions  are 
drawn  in  Section  6. 


2  Logistics  Modelling 


In  this  section  we  introduce  the  concept  of  a  generalised  logistics  network.  A  logistics 
chain  is  then  defined  as  a  special  instance  of  this  network.  SR  dynamics  are  introduced 
to  describe  the  manner  in  which  goods  are  resupplied  throughout  the  chain.  With  this 
model,  we  present  a  technique  to  determine  stockholding  policies  for  warehouses  through¬ 
out  the  chain  with  the  ultimate  goal  to  minimise  the  risk  of  stockouts  and  shortfalls  in 
consumption.  This  work  primarily  provides  an  analytical  conceptual  framework  for  the 
study  of  SR  dynamics,  including  the  idea  that  volume  and  capacity  can  be  reduced  while 
adequate  distribution  and  throughput  are  still  maintained.  An  illustrative  example  of  how 
the  framework  might  be  applied  in  practice  is  also  provided.  However,  no  empirical  data 
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is  presented  from  military  supply  chains  in  the  real  world  so  all  examples  are  indicative  of 
potential  applications  rather  than  demonstratively  and  provably  practical  systems. 

2.1  Logistics  Supply  Network 

A  logistics  supply  network  is  modelled  as  a  set  of  network  nodes  JV  and  a  set  of  directed 
arcs  joining  these  nodes.  Nodes  in  the  network  are  uniquely  identified  by  an  ordinal 
labelling  over  the  natural  numbers  N.  For  simplicity,  we  assume  that  two  nodes  in  the 
network  are  connected  by  at  most  one  arc.  Hence,  directed  arcs  are  uniquely  identified  by 
the  ordered  node  tuple  (i,  j),  i,j  G  jY .  These  nodes  and  arcs  represent  entities  in  the  real 
world  such  as  warehouses,  consumers  or  manufacturers  and  roads,  pipelines  or  telephone 
lines,  for  example.  A  logistics  supply  network  is  then  formally  defined  as  the  directed 
graph  G  = 

Every  node  i  G  jV  at  the  instance  in  time  f  £  5”  is  attributed  with  the  following  charac¬ 
teristics: 

•  a  current  holdings  level  h\ ,  which  denotes  the  amount  of  stock  held  by  the  node  i  at 
time  t ■ 

•  a  desired  holdings  level  Hj,  which  denotes  the  policy  for  the  preferred  level  of  stock 
to  be  held  at  node  i  at  time  t; 

•  a  warehouse  capacity  level  Wf ,  which  denotes  the  maximum  possible  amount  of  stock 
that  may  be  held  at  node  i  at  time  t; 

•  a  current  consumption  level  cj,  which  denotes  the  expenditure  or  usage  of  stock  at 
node  i  at  time  f;  and 

•  a  desired  consumption  level  C\.  which  denotes  the  demand  or  preferred  level  of 
consumption  for  stock  at  node  i  at  time  t. 

Every  arc  ( i,j )  G  sY  at  the  instance  in  time  t  G  2Y  is  attributed  with  the  following 
characteristics: 

•  a  current  throughput  level  8j  ■ ,  which  denotes  the  amount  of  stock  transferred  from 
the  node  i  at  the  time  t  to  the  node  j  at  time  t  +  1;  and 

•  a  maximum  throughput  A*  ■ ,  which  denotes  the  maximum  possible  amount  of  stock 
that  may  be  transferred  from  the  node  i  at  time  t  to  the  node  j  at  time  t  +  1. 

The  consumption  at  time  t  for  all  nodes  i  is  given  by 

J=f  Ct>h\-\ 

i  \  Cl  cl  <  h\-\ 

Hence,  nodes  in  the  network  have  no  choice  but  to  meet  the  preferred  level  of  consumption 
Cf  at  time  t  using  the  stock  holdings  carried  over  from  the  previous  time  period  t  —  1. 
That  is,  nodes  may  not  willingly  hold  back  stock  when  it  is  in  demand. 

The  holdings  level  h\  for  every  node  i  G  jV  at  the  instance  in  time  t  G  2Y  is  then  the 
stock  carried  over  from  the  previous  time  period  t  —  1,  less  the  consumption  at  t,  less  the 


(1) 
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difference  between  outgoing  stock  transfers  and  incoming  stock  transfers,  given  that  the 
maximum  warehouse  capacity  Wf  cannot  be  exceeded.  Formally, 


hi  =  min  {IT* ,  h\  1 +  (2) 

Here,  we  expect  the  warehouse  capacity  IT*  to  be  constant  in  time  at  each  node  in  most 
cases.  However,  we  allow  the  capacity  to  vary  in  time  to  allow,  for  example,  policy 
restrictions  preventing  the  entire  warehouse  being  utilised  and  expansions  to  the  warehouse 
capacity. 

To  prohibit  nodes  from  transferring  more  stock  than  is  held  by  those  nodes  during  time 

<  ht‘ -  4-  (3) 

j¥=i 

Hence,  nodes  may  only  transfer  stock  carried  over  from  the  previous  time  step  t  —  1  in 
excess  of  the  consumption  level  c\. 

In  addition,  we  define  the  quantity  Q  as  the  base  multiple  or  unit  size  for  stock.  Stock 
may  only  be  transferred  between  nodes  in  multiples  of  Q  so  that 


=  Ql\ 


(4) 


for  'yjj  G  N. 

We  propose  a  penalty  function  which  is  based  on  two  terms:  a  penalty  term  for  not 
maintaining  the  desired  holdings  level;  and  a  penalty  term  for  not  supplying  enough  stock 
for  consumption.  We  define  this,  for  the  nodes  i  G  <yF/{0},  as 


?r(M)  =  A\ Hj 


h\\*  +  B\Cl-d 


t\/3 


(5) 


where  A.  B,  a  and  (3  are  constants.  Here,  many  different  forms  of  penalty  functions  could 
be  used  as  the  generality  of  the  techniques  we  apply  does  not  depend  on  the  specific  form 
of  the  function.  However,  outcomes  of  particular  applications  of  those  techniques  will 
ultimately  differ  based  on  the  forms  chosen. 

The  node  0  is  considered  to  be  the  source  of  all  stock.  This  node  does  not  consume  stock, 
has  no  warehouse  limit  and  always  has  enough  stock  to  transfer. 


2.2  Logistics  Supply  Chain 

A  logistics  chain  is  a  particular  instance  of  a  logistics  network,  as  given  in  Section  2.1. 
We  formally  define  the  chain  linear  supply  chain  of  length  L  +  1  as  the  directed  graph 
G  =  {JG  in  which  jV  =  {0,. . .  ,L}  and  s/  =  {{i  —  G  ^/{O}}. 

In  general,  there  are  many  possible  ways  to  schedule  the  flow  of  goods  within  the  chain, 
each  potentially  differing  in  value  as  determined  by  equation  (5).  Although  the  number 
of  distinct  ways  to  schedule  the  flow  of  goods  in  our  logistics  chain  is  finite  for  each  given 
chain,  enumeration  over  the  space  of  all  candidate  schedules  for  a  set  of  optimal  solutions 
is  impractical  for  all  but  the  simplest  of  chains.  Furthermore,  an  optimal  schedule  for 
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some  particular  instance  of  a  chain  tells  us  little  about  the  optimal  schedule  for  another. 
It  is  more  important  to  identify  general  techniques  or  algorithms  to  allow  us  to  resupply 
the  chain  across  all  possible  chains.  The  aim  then  is  to  identify  the  best  possible  algorithm 
across  a  range  of  chains  or  at  least  to  identify  one  or  more  useful  and  practical  algorithms 
which  assist  us  to  understand  the  properties  of  the  chain. 

One  of  the  simplest  techniques  for  restocking  nodes  in  the  logistics  supply  chain  is  to 
adopt  a  SR  regime.  Simply  put,  nodes  in  the  chain  consume  goods  following  equation  (1) 
and  subsequently  place  an  order  o\  for  stock  with  node  i  —  1  according  to  the  recurrence 
relation 


o\  =  o\+l  +  Q (rnin{  Hj  -  (1 

'4  1  -  c-),  h\_\  -  c\_ i,  mod  Q), 

(6) 

for  i  £  <yF/{0,  L}  and 

4  =  Q(min{iTl  -  (h^1 

~  cl)j  4-1  —  cL-l!  m°d  Q)i 

(7) 

otherwise. 


Although  equations  (6)  and  (7)  seem  complicated,  they  are  actually  quite  straightforward. 
For  each  i  £  c/F/jO},  these  equations  merely  recursively  compute  the  cumulative  order 
across  the  sub-chain  j  >  i  in  multiples  of  Q.  The  node  0,  having  unlimited  stock,  places 
no  orders.  Then 

%-i,i  =  °h  (8) 

describes  the  dynamics  of  the  supply  scheduling.  Here  we  assume  instantaneous  through¬ 
put. 


2.3  Heuristics  for  Time— Invariant  Holding  Policies 

Let  X{  =  Xj,  i  £  jV ,  i  £  J,  denote  the  random  variables  whose  outcomes  set  the 
consumption  values  Cj.  We  make  the  assumption  of  time-invariance  for  simplicity  and 
tractability. 

For  a  fixed  linear  supply  chain  G ,  we  wish  to  guarantee  that  the  likelihood  P [h\  <  Xi)  of 
ever  having  holdings  less  than  or  equal  to  the  fixed  amount  xt  is  below  some  stipulated 
acceptable  risk  probability  p.  This  section  presents  a  heuristic  to  set  constant  warehouse 
policy  levels  Hi  =  Hj  in  order  to  meet  the  criterion 

P {h\  <  Xi)  <  pi,  (9) 

for  constants  Xi  and  p.  Alternatively,  we  may  wish  to  guarantee  that  the  likelihood 
P (Cj  —  cj  >  Hi)  of  a  shortfall  in  desired  consumption  C\  of  more  than  the  fixed  amount  yl 
is  below  some  stipulated  acceptable  risk  probability  pt .  That  is 

p  {C-  -c\>  Vi)  <  Qi.  (10) 


The  assumption  is  made  that  we  are  able  to  sample  instances  or  access  data  about  the 
underlying  distributions  of  Xi  but  do  not  know  the  true  distributions  of  X%.  For  example, 
it  is  a  reasonably  simple  exercise  to  measure  the  average  rainfall  on  a  given  day  at  some 
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fixed  location.  However,  it  is  a  non-trivial  exercise  to  develop  a  rainfall  forecasting  model 
or  even  to  fit  sample  data  to  a  distribution  with  total  certainty.  The  techniques  we  apply 
are  then  practical,  within  reasonable  limits,  and  do  not  make  unnecessary  demands  on 
knowing  all  aspects  of  the  chain.  In  simulating  the  chain,  we  set  the  distributions  for 
Xj  and  sample  those  distributions  a  number  of  times  to  form  unbiased  estimators,  which 
approach  Xi  as  n  approaches  infinity,  in  order  to  provide  examples  of  our  techniques. 
These  examples  are  not  to  be  confused  with  the  general  technique  we  propose. 

The  ability  to  ever  satisfy  inequality  (9)  for  x  =  0,  p*  =  0  depends  on  the  distribution 
of  Xi.  For  example,  if  Xi  is  uniformly  distributed  over  the  integers  in  [0, 10]  then  it  is 
certainly  possible  to  guarantee  that  node  1  never  runs  out  of  stock  simply  by  setting  the 
holdings  policies  across  the  chain  to  sufficiently  large  integers.  However,  if  Xi  is  normally 
distributed  then  it  is  not  possible  to  guarantee  that  the  nodes  will  never  run  out  of  stock 
for  any  finite  holdings  policy  Hi  because  the  upper  tail  end  of  the  normal  distribution  is 
not  bounded.  The  probability  of  running  out  of  stock  quickly  diminishes  as  Hi  increases 
so  that  setting  Hj  to  an  arbitrarily  large  number  is  an  easy  way  to  satisfy  our  requirements 
for  all  practical  purposes.  However,  in  reality  Hi  is  limited  by  physical  constraints  such 
as  warehouse  sizes,  monetary  considerations  or  other  issues  which  prevent  us  from  having 
arbitrarily  large  warehouse  holdings  policies.  In  this  situation,  it  makes  sense  to  minimise 
the  total  holding  policies  across  the  chain  subject  to  the  constraints  (9)  or  (10). 


minimise  *22  Hi, 

such  that  P (h\  <  Xi)  <  pi  or  P (C\  —  c\  >  yi)  <  Qi. 


(11) 


Alternatively,  define  a  maximum  Hmax  for  the  holdings  policies  across  the  chain  and 
minimise  the  total  system  failure  probabilities  across  the  chain. 

minimise  E  P(h(  <  Xi)  or  EF(c?  ~4  >yi),  (12) 

such  that  E  Hi  <  Hmax- 


Both  systems  (11)  and  (12)  are  difficult  to  solve  as  they  stand.  We  use  Lagrangian 
Relaxation  (Everett,  1963;  Fisher,  1985)  to  translate  our  original  problem  into  a  simpler 
form. 

Considering  now  only  cases  for  equation  (9),  the  program  (11)  becomes 


maximise  2z?((/>) 
such  that  i -pi 


=  min{E  Hi  +  E(<MF(^  <  %i)  ~  Pi))}, 

>  0, 


and  the  program  (12)  becomes 

maximise  2z?(V’)  =  min{E  P {h\  <  Xi)  +  ^(E  Ht  -  Hmax)}, 
such  that  ip  >  0, 


(13) 


(14) 


where  2z?  is  the  objective  function,  cp  =  ((pi),  i  6  JV  is  the  vector  of  Lagrangian  multi¬ 
pliers  and  ^  is  a  scalar  Lagrangian  multiplier.  The  cases  for  equation  (10)  follow  mutatis 
mutandis. 

To  explain  how  Lagrangian  Relaxation  works  in  our  case,  consider  the  objective  function 
Jz?^)  in  the  program  (14).  For  ip  — >  0,  the  value  of  the  objective  function  is  dominated 
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by  the  first  term  ^  ¥(hj  <  Xj)  so  that  the  minimisation  will  set  the  holdings  Hi  as  large 
as  possible  and  the  probability  of  holding  less  than  Xi  becomes  arbitrarily  small,  assuming 
also  that  iJj  ^  H{  — s ►  0  as  i/)  — ►  0,  which  is  reasonable  when  ^  HL  is  bounded.  The  value  of 
the  objective  function  approaches  0  from  above.  On  the  other  hand,  as  ip  — >  oo  the  value 
of  the  objective  function  is  dominated  by  the  second  term  V’E  ~  Hmax)  since  the  first 
term  can  add  to  at  most  L  when  P (h\  <  Xi)  =  1  for  every  i  =  1 ...  L.  As  the  holding 
policies  for  all  nodes  is  generally  expected  to  be  at  least  1  parcel  of  size  Q  then  Hmax 
should  be  set  by  the  user  to  at  least  QL  and  is  usually  significantly  greater.  However,  we 
allow  Hi  =  0  even  though  this  makes  little  sense  in  practice.  The  value  of  the  objective 
function  approaches  (L  —  ipH.max)  — >  —  oo  as  ^  — ■>  oo.  It  is  not  difficult  to  see  that,  in 
the  minimisation,  there  exists  some  finite  ^  >  0  which  causes  ^  Hi  to  equal  values  other 
than  arbitrarily  large  numbers  or  0.  For  example,  if  such  a  tj}  were  to  cause  ^  Hi  to  equal 
Hmax  then  the  second  term  in  the  objective  function  is  eliminated  but  the  first  term  in  the 
objective  function  is  likely  to  exceed  0  depending  on  whether  Xi  is  bounded  to  some  finite 
amount,  in  which  case  there  might  be  no  gain  in  raising  ^  Hi  beyond  some  fixed  amount 
which  suffices  to  ensure  ^  P (h\  <  Xi)  =  0.  Otherwise,  the  value  of  the  objective  function 
is  strictly  positive.  It  is  a  ip  of  this  kind  which  solves  our  problem.  This  appeals  to  reason 
because  observing  the  initial  problem  (12)  we  can  intuitively  see  that  solutions  where  ^2  Hi 
is  close  to  Hmax  are  likely  to  be  optimal.  The  Lagrangian  Relaxation  (13)  works  in  a  more 
complicated  but  analogous  way  to  (14).  To  solve  our  Lagrangian  Relaxation  problems, 
we  propose  to  use  Simulated  Annealing  (SA)  (Kirkpatrick  et  al.,  1983;  Cerny,  1985)  as 
explained  in  Appendix  B. 

The  implementation  of  the  SA  algorithm  for  our  problem  starts  by  identifying  the  state 
space,  which,  given  that  we  make  no  undue  assumptions  about  the  distributions  of  Xi, 
i  £  jV ,  is  not  trivial.  We  have  already  indicated  that,  for  values  of  pi  in  the  limit 
approaching  zero,  the  Hi  may  become  arbitrarily  large.  Hence,  we  require  an  approach 
which  bounds  H{  based  on  some  measurement  of  statistical  likelihood.  The  technique  we 
employ  samples  the  distributions  of  Xi,  T  =  1000  independent  times.  These  samples  are 
used  to  construct  an  approximation  to  the  distribution  of  the  random  variable  Yj,  whose 
outcome  describes  the  average  of  the  sum  of  the  Xj  over  j  >  i.  The  reasoning  here  is 
that  on  average  the  node  i  needs  to  hold  enough  to  meet  its  own  consumption  and  also 
to  fill  any  stock  order  made  by  the  node  i  +  1 .  Node  i  +  1  orders  stock  based  on  its  own 
consumption  and  the  order  made  by  the  node  i  +  2.  With  recursion,  we  propose  that  on 
average  the  node  i  is  required  to  hold  enough  to  fill  its  own  consumption  requirements 
and  also  enough  that  the  consumption  requirements  of  all  nodes  j  >  i  are  met.  This 
embraces  the  concept  of  throughput  over  capacity.  It  is  not  necessarily  true  that  node 
i  needs  to  meet  the  total  desired  consumption  of  the  sub-chain  j  >  i.  It  could  be  that 
nodes  stockpile  against  the  likelihood  of  not  being  supplied.  In  our  systems,  we  wish  to 
reduce  the  stockpiling  levels  and  concentrate  on  the  benefits  of  increasing  throughput  with 
appropriate  holding  policies.  Hence,  our  approach  is  reasonable  within  this  context. 

A  histogram  for  the  sample  outcomes  of  Y),  i  G  JV  is  obtained  using  Monte  Carlo  sampling. 
This  histogram  is  used  to  set  upper  bound  for  Hi  based  on  the  tail  probability  ?  intended 
to  capture  the  upper  percentile  of  Yj.  Figures  1,  2  and  3  display  the  average  outcomes 
of  Yi  over  100  independent  simulations  each  lasting  1000  time  steps  for  nodes  1,  2  and  3 
respectively  in  a  4  node  chain.  The  error  bars  denote  95%  confidence  intervals  using  a 
f-statistic.  In  these  figures,  the  upper  tail  end  of  the  histograms  are  shaded  in  red  to  show 
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Outcome  Y  at  node  1 

Figure  1:  Histogram  of  sample  outcomes  of  Y  at  node  1  over  100  independent  simulations 
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Figure  2:  Histogram  of  sample  outcomes  of  Y  at  node  2  over  100  independent  simulations 
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Figure  3:  Histogram  of  sample  outcomes  of  Y  at  node  3  over  100  independent  simulations 
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the  upper  ?  =  0.05%  of  the  histrograms.  The  bound  of  zero  is  used  as  the  lowest  value  for 
Hi  in  this  model.  However,  the  lower  q  percentile  of  Yt  is  also  shaded  red  because  setting 
a  lower  bound  for  Hi  is  useful  in  many  practical  situations. 

The  initial  state  of  the  system  is  set  to 


H*  =  Hi  = 


t= 1  j>r 


(15) 


for  i  G  jV /{$}.  The  Hi  values  set  in  this  step  then  denote  the  sum  of  the  sample  means 
for  consumption  of  the  node  i  itself  and  all  nodes  that  node  i  supplies  and  is  an  unbiased 
approximation  of  the  expected  value  E[Yi\. 

To  implement  the  SA  algorithm,  we  need  to  be  able  to  perturbate  the  system  -  that  is: 
for  each  temperature  level  TgT  define  the  disturbance  functions  '■  — >  S  which  takes 

as  input  the  current  state  of  the  system  and  generates  a  new  state.  Let  H^igh  and  Hl°w 
denote  the  maximum  and  minimum  values  for  node  i  G  o/L/{ 0}  as  determined  by  the 
upper  and  lower  c,  percentiles  of  Yl.  Again,  we  use  H\ow  =  0  for  reasons  that  will  become 
apparent  shortly.  Then,  the  disturbance  function  £  is  implemented  to  select  a  node  at 
random,  excluding  the  source  node  0,  and  add  or  subtract  a  uniformly  distributed  random 
number.  This  number  is  generated  between  1  and  \{H^l9h  —  Hl°w)  with  equal  likelihood 
from  its  holding  policy  to  generate  a  new  state,  where  A  =  0.1  is  a  proportionality  constant. 
The  outcome  of  the  algorithm  does  depend  somewhat  on  A.  The  value  of  0.1  is  chosen 
because  it  was  observed  to  be  effective  -  that  is;  the  algorithm  converged  to  a  “better” 
solution  more  quickly  than  other  values  we  tested.  We  use  the  cooling  schedule 


Tj  =  Tj-i/2,  (16) 

for  j  =  1, ...  ,7  with  initial  temperature  To  =  1. 

To  demonstrate  our  method,  we  compare  two  supply  chains  of  length  four.  In  the  first 
instance,  SA  is  used  to  provide  an  approximate  solution  to  a  Lagrangian  Relaxation  prob¬ 
lem  of  the  type  (14)  with  Hmax  =  25,  where  the  second  term  in  the  objective  function 
from  (12)  is  used  -  that  is,  )E  P(C';-  —  c\  >  yi),  for  yi  =  0  Vi.  In  the  second  instance,  Brute 
Force  (BF)  enumeration  over  all  feasible  solutions  is  used  to  solve  the  same  Lagrangian 
Relaxation  problem  optimally.  Desired  consumption  is  normally  distributed  with  a  mean 
of  5  and  a  standard  deviation  of  3  so  that  the  underlying  simulations  of  the  logistics 
chains  are  stochastic  and  optimality  is  itself  subject  to  random  stochastic  noise.  A  single 
simulation  of  each  chain  lasts  1000  steps  in  time.  The  inner  Metropolis  loop  of  the  SA 
algorithm  perturbates  the  system  K  =  100  times  at  each  temperature  level.  Each  of  the 
two  approaches  are  repeated  a  total  of  100  independent  times.  Results  are  provided  in 
Table  1.  In  this  table,  the  column  labelled  BF  describes  the  best  value  of  the  objective 
function  Jzf  (£>)  found  by  completely  enumerating  over  the  state  space,  the  column  labelled 
SA*  describes  the  best  value  of  the  objective  function  found  over  all  100  independent  trials 
and  the  SA  describes  the  mean  value. 


The  best  solution  found  for  Hi  across  the  chain  occurs  at  V’  =  0.06  with  value  of  2z ?(<f>)  = 
0.0783.  For  values  of  ^  between  0.55  and  0.65  in  steps  of  0.001  a  better  solution  at 
'(/’  =  0.059  with  value  2z?(0)  =  0.814  is  located.  Random  variation  in  the  results  impedes 
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Table  1:  Results  of  simulations  for  BF  enumeration  and  SA  in  a  four  node  chain  for  values 
of  if  between  0.03  and  0.10 


if 

BF 

m 

T,Hi 

SA* 

m 

EHi 

SA 

SDev 

0.03 

0.527 

(17,  13,  10) 

40 

0.533 

(17,  13,  9) 

39 

0.615 

0.065 

0.04 

0.665 

(17,  12,  8) 

37 

0.628 

(16,  12,  9) 

37 

0.753 

0.093 

0.05 

0.752 

(15,  12,  7) 

34 

0.753 

(15,  11,  7) 

33 

0.897 

0.120 

0.06 

0.783 

(10,  7,  0) 

17 

0.755 

(11,  8,  0) 

19 

0.890 

0.099 

0.07 

0.718 

(10,  7,  0) 

17 

0.696 

(11,  7,  0) 

17 

0.782 

0.068 

0.08 

0.610 

(7,  0,  0) 

7 

0.595 

(7,  0,  0) 

7 

0.673 

0.070 

0.09 

0.423 

(6,  0,  0) 

6 

0.415 

(6,  0,  0) 

6 

0.529 

0.071 

0.10 

0.266 

(5,  0,  0) 

5 

0.241 

(7,  0,  0) 

7 

0.379 

0.090 

Total  H 
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Figure  4-'  Graph  of  the  total  of  the  holdings  policies  in  the  chain  and  the  utility  2z?(</>)  for 
BF  enumeration  and  SA  plotted  against  if 


our  ability  to  accurately  calculate  the  objective  function  beyond  this  sensitivity.  A  phase 
type  behaviour  is  observed  between  if  =  0.057  and  if  =  0.058  where  the  optimal  holdings 
policies  change  from  (14, 10,4)  to  (10,6,0).  Overall,  the  dynamics  appear  to  contain  two 
phase  transitions  in  the  BF  Hi  plot  of  Figure  4.  A  shift  in  policies  between  those  in 
which  H3  >  0  and  H3  =  0  is  important  because  if  at  any  time  H3  >  0  then  there  must  be 
sufficient  capacity  and  throughput  in  all  other  nodes  for  the  investment  to  be  worthwhile. 
When  H3  =  0,  an  increased  penalty  is  invoked  in  not  meeting  desired  consumption  at  node 
3  but  the  holdings  policies  of  all  other  nodes  are  reduced  to  offset  the  loss.  To  demonstrate 
this  principle,  we  ran  a  BF  enumeration  with  the  additional  constraint  H3  >  0.  The 
optimal  policy  for  the  system  shifted  from  (10,  6,  0)  to  (9, 6, 3).  In  this  situation  one  might 
have  assumed  that  H3  would  be  set  to  the  lowest  feasible  value,  namely  unity.  Instead,  it 
was  set  to  a  value  of  3  due  to  the  higher-order  non-linear  dynamics  of  the  chain.  Obviously, 
by  enforcing  the  constraint  H3  >  0  we  are  forcing  the  system  to  stay  in  one  phase.  This 
is  important  because  it  clearly  demonstrates  that  discontinuities  in  the  chain  exist;  that 
is,  points  at  which  two  distinct  solutions  exist  with  equal  value.  Care  must  then  be  taken 
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in  the  neighborhood  of  transition  points. 

Figure  4  graphs  the  results  of  Table  1,  where  the  rightmost  ordinate  axis  (labelled  ‘Utility1) 
measures  the  value  of  the  objective  function  and  the  leftmost  ordinate  axis  (labelled 

‘Total  H‘)  measures  the  22  Hi.  In  this  figure,  the  curve  depicting  the  minimum  value  of 
Jz? ((/>),  obtained  for  each  fixed  tp  in  the  SA  algorithm,  actually  lies  below  that  obtained 
using  brute  force  enumeration.  This  is  clearly  an  aberration  in  our  method  caused  by 
random  variation  across  the  consumption  generated  in  each  independent  simulation.  The 
reason  is  one  or  more  of  the  100  independent  SA  results  are  simply  better  than  the  result 
of  the  single  simulation  using  brute  force  enumeration.  The  concept  is  somewhat  similar 
to  rolling  a  die  once  and  then  trying  to  beat  that  role  by  rolling  100  dice.  In  any  case,  it 
is  clear  that  the  best  result  of  the  SA  algorithm  is  as  close  as  one  can  reasonably  get  to 
optimal,  given  the  random  variation  across  the  chain.  The  effect  of  this  random  variation 
can  be  reduced  in  practice  by  running  each  simulation  for  longer.  In  our  example,  1000 
time  steps  is  considered  sufficient. 


2.4  Cascades  in  Logistics  Supply  Chains 


We  are  interested  in  the  stability  of  the  logistic  supply  chain  to  perturbations.  We  seek 
to  determine  if  small  changes  cascade  through  the  system  or  whether  the  logistics  supply 
chain  is  robust  to  perturbation.  Under  the  SR  policy,  we  investigate  the  stability  of  the 
network  to  a  perturbation  of  size  e  =  mQ  in  consumption  at  the  end  node  L,  where 
m  £  N.  That  is,  e  is  a  non-negative  multiple  of  the  unit  size  for  stock  Q.  If  the  desired 
consumption  C*L  results  in  current  consumption  c*L  and  in  a  resupply  order  o^,  then  the 
desired  consumption  CfL  =  CfL  +  e  produces  a  current  consumption 


Mr1,  ct  >MM 

Cl  &L  <  hi 1 


(17) 


and  c\  =  c\,  for  i  £  N/{L}. 

The  resupply  order  to  node  L  —  1  is 

4  =  Q(min{i?£  -  (hi1  -  cfL),  hl~\  -  4- mod  Q)„ 

<  Q(min{i?£  -  (H  -  (C{  +  e)),  }>l\  -  C£_i,  A; 4i,l }  mod  Q),  (18) 

<  °L  +  £• 


For  the  recurrence  relation, 

o\  =  o\+ 1  +  Q(mm{Hf  -  (hpr1  -  cj),  h\ Z\  -  c\_ i,  mod  Q), 

<  °i+i  +  £  +  Q(mm{Hj  -  (M_1  -  4),  h\ -  c*_i,  A *_!,*}  mod  Q),  (19) 

<  o-  +  e, 


for  i  £  oT’/jO ,L}.  Therefore,  orders  do  not  grow  along  the  logistics  supply  chain  when 
the  system  is  perturbed.  We  now  consider  the  state  of  the  system  at  time  t  +  1  after  a 
perturbation  at  node  L  at  time  t.  The  holdings  at  node  L  is 

hi1  =  min {Wl+\  hi  -  cl 1  +  Sl\L},  (20) 
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where 


hi  =  min {Wl,  htL1-ctL  +  ljL}. 

Note  that 

$i-i,L  —  <  ci  —  ci  ^  hi  <  hi- 


Suppose  IT{+1  is  the  minimal  term  in  equation  (20).  Then 


IT 


t+i 


<  hi  -  4+1  +  4+1, 


<  hi  -  4+1  +  5t+1 


L—1,L' 


(21) 

(22) 


(23) 


Therefore, 

hl+l  =  h 5+1  =  WfL+1.  (24) 

That  is,  the  current  holdings  level  equals  the  warehouse  capacity  regardless  of  whether  the 
system  is  perturbed,  meaning  the  perturbation  dies  out  at  node  L  after  one  time  step. 

Suppose  instead  that  IT£+1  >  hi  —  c^+1  +  60}  \  L.  Then 


oi+1  =  Q(miri{i4(+'  -  (hi  -  c*L+1),  hlL_x  -  4+\,  ^\L}  mod  Q), 

<  Q(nmi{H^  -  (^  -  £  -  4+1),  hi_x  -  c^,  A?\L}  mod  Q),  (25) 

<0^+8. 


This  shows  that  the  SR  ordering  policy  is  stable,  since  any  perturbation  at  node  L  cannot 
grow  either  along  the  chain,  or  over  time.  The  above  proof  can  be  easily  extended  for  nodes 
other  than  L.  Interestingly,  linear  supply  chains  are  not  always  stable.  Bender  (2004) 
has  shown  that  if  orders  are  placed  to  satisfy  a  stockpiling  policy  Hj  =  a*  +  C- ,  where 
at  is  constant,  the  logistics  supply  chain  is  unstable  and  any  perturbation  is  amplified 
exponentially.  Hence,  it  is  possible  that  the  logistics  supply  chain  will  exhibit  chaotic 
dynamics  under  different  ordering  policies  other  than  SR. 

We  also  investigate  the  interdependencies  between  failures  at  different  nodes  in  the  supply 
chain.  For  the  brute  force  enumeration  in  Appendix  A,  the  expected  probability  of  failure 
Ei\P(C^  —  c\  <  —1)]  averaged  over  all  1296  holding  policy  combinations  for  each  node  i 
is  0.0000,  0.0006,  0.0294,  and  0.2669  respectively.  On  average,  most  of  the  failures  (90%) 
occur  at  the  fourth  node  in  the  supply  chain.  However,  while  this  symptom  is  exhibited 
at  node  four,  it  is  not  clear  that  the  holding  policy  at  node  four  is  responsible  for  all 
of  the  failures.  One  way  to  investigate  interdependencies  between  nodes  is  to  remove 
the  first  node  from  the  supply  chain  and  simulate  the  remaining  nodes  with  the  same 
consumption  schedules,  with  node  2  now  connected  directly  to  the  source  node  0.  Because 
this  removes  the  effect  of  the  first  node’s  consumption  and  holding  policy,  the  reduction 
in  Ej\P(Cj  —  cfj  <  —1)],  j  G  {2,3,4}  can  be  considered  to  be  the  component  of  failures 
at  node  j  that  are  caused  by  the  holding  policy  and  consumption  at  node  1.  This  process 
can  be  applied  recursively  to  identify  the  impact  of  each  node  on  downstream  failures. 

Figure  5  depicts  £'j[P(C'{  —  c\  <  —  1)]  for  each  node.  Each  column  in  the  graph  is  segmented 
into  regions  that  represent  the  causal  influence  of  upstream  nodes.  At  node  4,  even  when 
all  other  nodes  are  removed  from  the  logistics  supply  chain,  C\  —  c\  <  —  1  occurs  0.166  of 
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the  time.  In  comparison,  the  activities  of  node  3  add  0.055  to  E/\^{C\  —  c\  <  —  1)],  while 
the  contributions  of  nodes  2  and  1  are  0.030  and  0.016  respectively.  For  node  3,  in  the 
absence  of  nodes  1  and  2,  Cj  —  c|  <  —1  occurs  less  than  0.006  of  the  time,  with  consumption 
and  holdings  in  nodes  2  and  1  contributing  to  0.012  and  0.010  of  failures  respectively.  In 
summary,  the  holding  policy  at  node  4  is  responsible  for  62%  of  the  failures  at  node  4, 
whereas  77%  of  node  3’s  failures  are  due  to  interdependencies  with  upstream  nodes. 

This  process  is  also  performed  in  reverse  to  identify  how  consumption  by  nodes  downstream 
can  cause  failures  to  meet  consumption  upstream  in  later  time  steps.  When  node  4  is 
removed,  E^[F(C^  —  c\  <  —1)]  is  reduced  from  0.029  to  0.008,  so  the  consumption  at  node 
4  causes  0.021  of  failures  at  node  3.  Ei[F(Cj  —  c\  <  —1)]  is  less  than  0.001  for  nodes  1  and 
2,  so  these  values  are  too  small  to  reliably  estimate  the  causal  influence  of  downstream 
nodes.  Of  course,  the  nodes  can  be  removed  in  any  other  permutation,  which  will  show 
different  causal  relations.  However,  the  motivation  for  recursively  removing  from  one  end 
of  the  supply  chain  is  to  determine  the  causal  relation  compared  to  the  distance  between 
two  nodes,  which  is  successively  reduced  by  one  when  the  nodes  are  removed  in  order. 

The  above  analysis  shows  the  impact  of  upstream  consumption  can  be  greatly  magni¬ 
fied  downstream.  However,  the  five  node  supply  chain  is  not  long  enough  to  determine 
whether  the  impact  continues  to  grow  or  eventually  dies  out.  Figure  6  shows  simula¬ 
tion  results  from  a  20  node  logistics  supply  chain.  The  results  are  averaged  over  the 
set  of  holding  policies  within  one  unit  distance  of  the  solution  found  by  SA,  H*  = 
(84,  87,  79, 94,  76,  84, 84,  62,  57, 48, 49, 46, 30,  36,  31,  24, 12, 18,  7).  Only  nodes  11  through 
19  are  graphed,  since  F(Cj  —  c\  <  y)  is  less  than  0.0001  for  nodes  1  through  10,  and  node 
0  is  the  source  node.  This  graph  shows  that  the  proportion  of  failures  due  to  node  1 
across  nodes  10  through  17  is  significantly  greater  than  any  other  node  and  that  if  this 
node  were  to  be  removed  from  the  simulation  then  the  majority  of  the  failures  at  nodes  10 
through  17  would  be  eliminated.  However,  nodes  18  and  19  show  that  the  effect  of  node  1 
reduces,  both  in  absolute  terms  and  as  a  percentage  of  total  failures.  Node  13  also  causes 
magnified  downstream  failures,  which  peak  at  node  18  and  are  smaller  at  node  19.  The 
impact  of  upstream  consumption  has  not  continued  to  grow  unabated  along  the  chain. 
We  hypothesize  this  is  due  to  the  stochastic  nature  of  consumption  in  the  supply  chain. 
Although  an  above  average  consumption  at  some  time  step  upstream  will  have  an  adverse 
effect  downstream,  the  further  downstream  a  node  is,  the  greater  the  probability  is  that 
intermediate  nodes  will  have  below  average  consumption.  The  further  downstream  a  node 
is,  the  more  fluctuations  at  the  start  of  the  chain  are  lost  in  the  noise  of  the  fluctuations 
at  intermediate  nodes.  This  would  act  to  balance  the  apparent  short  range  magnifications 
to  produce  a  peak  that  eventually  dies  out. 

The  other  interesting  feature  of  Figure  6  is  that  only  nodes  1,  13,  17  and  19  cause  significant 
percentages  of  the  failures  in  this  supply  chain.  All  of  these  nodes  share  the  property  that 
Hj  <  Hj+1,  despite  the  fact  that  they  must  supply  at  least  as  much  as  node  i  +  1  every 
turn.  This  suggests  that  these  nodes  are  weak  points  in  the  chain  that  may  provide  the 
greatest  marginal  return  for  increases  in  Hj,  although  we  also  note  that  other  nodes  share 
this  property  yet  do  not  appear  to  be  weak  points.  Therefore,  this  condition  may  be 
necessary  but  not  sufficient  for  identifying  potential  weak  points.  Because  the  SA  policy 
after  a  finite  number  of  iterations  is  not  guaranteed  to  be  optimal,  the  causal  analysis 
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Figure  5:  Average  proportion  of  failures  for  a  five  node  supply  chain  as  nodes  are  removed 
from  the  start  of  the  chain 
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Figure  6:  Average  proportion  of  failures  for  a  20  node  supply  chain  as  nodes  are  removed 
from  the  start  of  the  chain 
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method  we  have  developed  could  potentially  offer  a  constructive  technique  for  improving 
on  the  SA  policy. 


3  Control  Theory 

In  the  previous  section,  the  manner  in  which  goods  are  routed  or  scheduled  throughout 
the  system  is  fixed,  according  to  equations  (6)  and  (7).  Under  this  fixed  scheduling, 
the  holdings  policies  Hf  are  varied,  subject  to  known  constraints,  with  the  objective  to 
minimise  stockouts  and  shortfalls  in  supply.  In  this  section,  we  propose  to  fix  the  holdings 
policies  Hf  and  determine  the  optimal  schedule  for  goods  using  Control  Theory  under  the 
assumption  of  perfect  foresight  in  consumption.  The  performance  of  both  techniques  for 
scheduling  goods  are  then  tested  under  a  relaxation  of  the  perfect  foresight  assumption 
by  incorporating  stochastic  noise  into  predictions  for  consumption  throughout  the  system 
in  both  time  and  space.  Additional  comparisons  with  correlated  consumption  are  also 
conducted  to  stress  the  system. 


3.1  Multistage  Decision  Process 

Consider  a  discrete  process  on  the  state  space  S  with  a  finite  number  of  stages  Af  = 
{1, . . . ,  N}]  that  is,  a  multistage  process  with  finite  horizon.  At  each  stage  in  the  process 
n  £  Af,  a  control  decision  dn  £  V  is  applied.  The  process  obeys  the  transition  function 
Tn  '■  S  xD->5  such  that 

•Sn+l  =  T~n{sn,  dn) ,  Sn  £  S ,  dn  £  TA,n  £  A/" .  (26) 

This  describes  a  deterministic  Markov  decision  process  because  the  transitions  between 
states  are  not  stochastic  and  the  state  at  each  stage  of  the  process  depends  only  on  the 
state  of  the  system  and  a  control  decision  at  the  previous  stage.  A  logistics  network  in 
general  could  also  be  modelled  to  depend  on  an  extended  history  of  the  process,  beyond 
the  previous  stage.  Such  requirements  can  be  built  into  the  decision  process  under  the 
transition  function  (26),  for  dependence  on  an  extended  history  of  finite  length,  with 
a  suitable  interpretation  of  the  state  space  of  the  system  which  takes  into  account  the 
physical  condition  of  the  system  across  multiple  instances  in  time. 

Of  course,  the  set  of  feasible  control  decisions  Tn,s  at  any  given  stage  n  £  AT  also  depends 
on  the  state  of  the  process  s  £  S  so  that  J-n  s  C  V.  Hence,  we  further  require  not  only 
that  dn  £  V  but  also  that  dn  £  J~n.s ,  s  £  S  for  every  n  £  AT.  Also  note  that  we  can  define 
V  =  U neAf,seS^n,s  so  that  the  set  of  control  decisions  T>  contains  only  feasible  states. 

At  each  stage  n  £  Af  the  process  is  associated  with  a  return  function  vn  :  S  x  T>  — ►  M. 
The  value 

^  ^  'f  ^n(®ni  ^n)i  (27) 

n£j\f 

denotes  the  total  return  of  the  process  over  all  stages. 

Refer  to  Figure  7  for  an  illustrative  description  of  the  interactions  in  a  single  stage  of  the 
multistage  decision  process. 
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Figure  7:  Multistage  decision  process 


A  deterministic  multistage  decision  process  (Sniedovich,  1992,  pp. 31-33)  is  then  defined 
as  the  6-tuple  D  =  (A f,S,F,T,v,s)  where  s  denotes  the  initial  state  of  the  process  with 
the  objective  to 


maximise 

V 

~  XneA/-  vn{sni  dn), 

subject  to 

dn 

£  n,Sni 

n  G  A7, 

Sn+1 

—  7"n(Sji,  dn) , 

n  G  A7, 

Si 

=  S. 

(28) 


That  is,  given  a  multistage  decision  process  D ,  on  a  finite  horizon  of  N  stages  with  initial 
state  s,  determine  the  sequence  of  controls  di,  i  =  1 . . .  N,  that  maximises  the  value  of  the 
process  v.  We  define  this  sequence  of  controls  as  an  optimal  policy  of  D. 


3.2  Dynamic  Programming 

Bellman’s  Principal  of  Optimality  (Bather,  2000,  p.18)  states  that 

An  optimal  policy  has  the  property  that,  whatever  the  initial  state  and  initial 
decision  are,  the  remaining  decisions  must  constitute  an  optimal  policy  with 
regard  to  the  state  resulting  from  the  first  decision. 

This  result  means  that,  if  the  initial  control  decision  of  a  multistage  decision  process  is 
contained  within  an  optimal  policy  for  that  process,  then  the  remaining  control  decisions 
form  an  optimal  policy  for  the  remainder  of  the  multistage  decision  process  with  the  input 
state  resulting  from  that  initial  control  decision.  Bellman  (1952,  1957)  used  this  principle 
to  define  a  method  of  solving  the  multistage  decision  process  using  recursion.  That  is, 
starting  at  the  final  stage  and  last  state  of  the  process,  work  backwards  to  identify  an 
optimal  policy  for  the  process.  Hence,  Bellman  solved  the  multistage  decision  process 
as  a  number  of  sub-processes  in  which  each  process  extended  the  optimal  policy  of  the 
previous  process  until  the  optimal  policy  of  the  original  multistage  decision  process  had 
been  calculated.  This  procedure  is  called  Dynamic  Programming,  denoted  earlier  DP. 
Refer  to  Boudarel  et  al.  (1971,  pp. 11-12)  for  a  direct  derivation  of  mathematics  of  the 
method. 
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To  apply  Bellman’s  method,  recursively  define  the  optimal  value  vn(sn )  and  optimal  con¬ 
trol  dn,  over  the  stages  n  through  N,  for  the  multistage  decision  process  D ,  as 


On(s?i)  —  max  (on(sn,dn)  +  vn-\ -i(Tn(sn,  dn)),  n  G  J\f , 

dn  n,s-n 


(29) 


and 


dn  =  arg  max  (vn(sn,dn)  +  vn+i (rn(sn,dn)),  n  G  A7,  (30) 

dn  n,sn 

respectively,  where  L/v-i-i^n)  =  0. 

Then,  to  compute  an  optimal  policy  of  D ,  we  need  only  to  recursively  compute  dn ,  sn  and 
On- 


For  example,  suppose  that  a  supply  node  is  provided  with  a  schedule  of  consumption  xn 
over  the  next  four  months  n  =  1 ...  4.  The  node  initially  has  no  stock  in  hand  but  does  have 
an  arbitrarily  large  warehouse  capacity.  However,  the  node  is  penalised  for  carrying  over 
stock  from  previous  months.  A  monthly  penalty  is  invoked  for  holding  excess  unused  stock. 
Furthermore,  the  supply  node  is  unable  to  regulate  changes  in  deliveries  easily.  Agencies 
delivering  stock  to  the  supply  node  charge  a  penalty  for  varying  the  delivery  schedule 
between  months.  Then  our  Dynamic  Program  D  becomes  a  minimisation  problem  and  we 
associate  the  return  function  with  the  penalty  function. 

Let  the  consumption  in  the  first,  second,  third  and  fourth  months  be  x\  =  3,  =  0, 

X3  =  1  and  X4  =  1  units  of  goods  respectively.  Let  the  state  of  the  process  during  stage  n 
be  defined  by  the  tuple  (hn,  on-i),  where  hn  denotes  the  stock  holdings  carried  over  in  the 
warehouse  from  the  stage  n  —  1  and  on_i  denotes  the  delivery  received  by  the  warehouse 
during  stage  n  —  1.  No  stocks  are  carried  over  from  n  =  0  and  no  deliveries  are  received 
in  n  =  0.  Then,  let 

vn  =  100/in  +  300|on-on_i|,  (31) 

so  that 


vn(hn,  on- i)  =  min  (100/?.n  +  300|y  -  on_i|  +  vn+i(K  +  y  ~  xn,  y )).  (32) 

y>Xn-h„ 


Table  2:  Calculations  for  example  Dynamic  Program 


Stage  4  (4th  month) 

Stage  3  (3rd  month) 

Stage  2  (2nd  month) 

Stage  1  (1st  month) 

03 

v\ 

di 

h-3 

02 

03 

d3 

h'2 

Ol 

02 

d2 

hi 

oo 

Ol 

d\ 

0 

0 

300 

1 

0 

0 

300 

1 

0 

3 

1100 
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0 

0 

2000 

3 

0 

1 

0 

1 

1 

0 

400 

0 

1 

4 

1600 

1 

1 

0 

100 

0 

1 

1 

500 

1 

2 

5 

2000 

0 

1 

1 

400 

0 

2 

0 

300 

0 

1 

2 

700 

0 

2 

1 

600 

0 

2 

2 

900 

0 

The  optimal  policy  of  our  example  is  displayed  in  Table  2  in  bold  italic  numerals.  This 
policy  is  interpreted  as: 

•  ordering  3  units  goods  in  the  first  month,  supplying  3  units  goods  and  emptying  the 
warehouse; 
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•  ordering  1  unit  goods  in  the  second  month  and  holding  this  unit  in  the  warehouse; 

•  ordering  1  unit  goods  in  the  third  month,  supplying  1  unit  goods  and  holding  1  unit 
in  the  warehouse;  and 

•  supplying  1  unit  goods  and  emptying  the  warehouse  in  the  fourth  and  final  month. 


3.3  Logistics  Supply  Chain 


The  multistage  decision  process,  for  this  general  logistics  supply  network  of  Section  2.1,  is 
posed  as  follows.  Given  Hj,  Wf,  Cf,  Q,  and  A*  •  determine  the  Sjj ,  7*  ■,  hj,  and  cj  to 


min  Y,  *(i,t)  =  A\Hti-hj\a  +  B\Cti-<*f, 
ie^V/{o},te& 

subject  to  hj  =  min {Wf,  h\rx  -  c\  -  E#i  +  E <%i>, 


t 

f  M_1: 

.  7 

> 

h*-1 

ci  = 

1  Cl 

cl 

< 

h1-1 

ni  1 

1  y 

hj 

< 

h*-1- 

c\,  i 

G 

t 

G 

0  < 

y . 

hj 

<  A*  ■ 

—  hj 

,  i 

,3 

G  JV, 

t 

G 

S*  • 

= 

Qi{j, 

i 

,3 

G  JV, 

t 

G 

= 

lH?\, 

i 

G 

7^  • 

G 

N, 

i 

,3 

G  jV, 

t 

G 

S'. 

i  G  ^r/{0},  t  G  AT, 
i  G  ^r/{0},  t  G  AT, 


(33) 


DP  is  well-suited  to  calculating  an  optimal  policy  for  the  controls  in  a  multistage  decision 
process  for  a  linear  supply  chain  -  that  is,  solving  (33)  for  the  network  topology  of  a 
linear  supply  chain.  By  the  very  definition  of  optimality,  DP  finds  policies  that  are  both 
temporally  and  spatially  efficient.  The  DP  model  anticipates  future  consumption  for  goods 
and  holds  those  goods  dispersed  along  the  chain,  appropriately  displaced  in  time,  to  best 
meet  the  consumption  and  hence  maximise  the  difference  in  the  penalties  accrued  for 
holding  excess  stock  and  the  penalties  accrued  for  failing  to  meet  consumption. 


To  better  examine  the  impact  of  the  penalty  function  on  the  optimal  policy  for  the  DP 
model,  consider  a  four  node  network  with  warehouse  capacity  of  ten  and  holdings  policy 
of  four,  over  four  stages,  with  a  desired  consumption  of  ten  unit  goods  in  the  final  two 
stages  of  the  process  at  node  three.  For  this  experiment,  the  value  of  the  optimal  policy  is 
not  important  and  only  one  optimal  policy  is  provided,  although  the  nature  of  all  optimal 
policies  is  discussed. 

Let  h1  =  (hj),  i  =  1, . . . ,  3  and  let  h  =  (h*),  t  =  1, . . . ,  4.  Equation  (5)  with  A  =  B  =  1 
and  a  =  (3  =  2  results  in  three  solutions,  one  of  which  is 


h=  ((5,4,4),  (4,6,7),  (4,4,6),  (4,4,4)).  (34) 

This  solution  is  displayed  in  Figures  8  through  11.  In  these  figures,  the  arcs  denote  goods, 
either  transferred  between  nodes,  consumption  or  holdings  carried  from  the  previous  stage. 
These  figures  are  not  to  be  confused  with  Figure  7  in  which  arcs  between  nodes  denote 
stages,  the  input  arc  denotes  a  control  decision,  and  the  output  arc  denotes  a  return  value. 
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This  optimal  policy  presents  a  demonstration  of  the  temporal  and  spatial  properties  of 
the  system.  Notice  that  equation  (34)  results  in  a  stockpile  of  7  units  at  node  4  in  stage  2 
to  meet  the  consumption  of  10  (with  a  shortfall  of  3  units)  in  the  subsequent  time  period. 
This  choice  of  stockpiling  7  units  rather  than  10  is  a  direct  result  of  the  penalty  function. 
It  costs  (7  —  4)2  =  9  units  to  hold  the  7  units  (an  excess  of  3  units  on  the  desired  holdings 
level)  over  a  stage  and  it  costs  (10  —  7)2  =  9  units  in  failing  to  meet  the  target  quota  for 
consumption  of  10  units  (a  shortfall  of  3  units  on  the  desired  level).  Notice  that  these 
two  cost  terms  are  actually  invoked  in  different  stages.  The  total  of  these  two  terms  is  18 
units  with  an  additional  once  off  cost  of  1  unit  due  to  the  holdings  at  the  1st  stage.  It  is 
easy  to  see  that  this  is  the  optimum  behaviour  because  it  costs  (8  —  4)2  +  (10  —  8)2  =  20 
units  to  stockpile  8  units  instead  of  7  and  (6  —  4)2  +  (10  —  6)2  =  20  units  to  stockpile  6 
units  instead  of  7.  Both  of  these  options,  and  indeed  all  other  options,  are  worse  than  the 
optimum.  Next,  equation  (34)  indicates  that  6  units  of  goods  are  stockpiled  at  node  3  (for 
node  4)  in  stage  2  (for  consumption  in  stage  4).  This  costs  22  +  22  =  8  units  in  holdings 
over  two  stages  and  42  =  16  units  in  failing  to  meet  target  consumption  levels.  The  total 
cost  is  then  24  units.  Again,  stockpiling  any  amount  of  goods  other  than  6  units  results 
in  a  higher  total  cost  (eg  stockpiling  7  goods  results  in  a  total  cost  of  32  +  32  +  32  =  27 
and  stockpiling  5  goods  results  in  a  total  cost  of  l2  +  l2  +  52  =  27). 

In  a  military  context,  small  stockpiles  of  goods  are  desirable  because  they  have  the  effect 
of  reducing  the  logistics  footprint  compared  with  the  larger  footprint  under  increased 
stockpiling.  Infrastructure  such  as  warehouses  is  also  costly  to  establish  and  maintain. 
Furthermore,  keeping  large  inventory  levels  with  idle  stock  is  wasteful  of  resources.  Hence, 
A,  B ,  a  and  /3  in  equation  (5)  are  important  parameters  in  determining  the  penalty  invoked 
for  stockpiling.  We  provide  three  examples  to  illustrate  the  types  of  policies  obtained  under 
different  parameterisations  of  the  model. 

Suppose  that,  A  =  B  =  1  and  a  =  (5  =  1  in  equation  (5).  Then  there  are  five  solutions, 
one  of  which  is 

h=  ((4,4,4),  (4,4,4),  (4,4,4),  (4,4,4)).  (35) 

This  optimal  policy  simply  meets  the  warehouse  holding  policy  levels  of  4  units  at  each 
stage.  The  shortfall  in  consumption  is  not  ignored,  there  is  merely  no  advantage  to  be 
gained  in  stockpiling  goods  because  the  penalty  invoked  in  holding  excess  goods  exactly 
equals  the  penalty  invoked  in  failing  to  meet  desired  consumption.  Likewise,  there  is  no 
advantage  to  be  gained  in  not  stockpiling  goods.  Hence,  the  other  four  solutions  have 
a  holdings  level  of  1,  2,  3,  and  4  units  respectively  at  node  4  in  the  2nd  stage  (to  be 
consumed  in  the  3rd  stage).  It  is  not  efficient  to  stockpile  goods  over  2  stages,  as  it  was  in 
the  previous  example,  because  any  penalty  x  is  invoked  twice  for  holdings  above  4  units 
but  the  shortfall  in  consumption  10  —  4  —  x  is  invoked  only  once  so  that  the  net  penalty 
actually  increases  by  x  units.  Hence,  none  of  the  five  optimal  policies  attempts  to  stockpile 
any  additional  goods  for  the  consumption  at  node  4  in  stage  4. 

When  A  =  B  =  1,  a  =  2  and  j3  =  1  the  solution  (35)  is  again  optimal.  However,  in  this 
example  there  is  only  one  additional  optimal  policy 

h=  ((4,4,4),  (4,4,5),  (4,4,4),  (4,4,4)).  (36) 

Here  it  is  costly  to  stockpile  goods,  hence  the  best  policy  is  to  simply  meet  target  warehouse 
levels.  The  solution  (36)  is  optimal  because  l2  =  1  so  the  squared  term  for  holding  excess 
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Figure  9:  Linear  supply  chain  stage  t  =  2 


Figure  10:  Linear  supply  chain  stage  t  =  3 


Figure  11:  Linear  supply  chain  stage  t  =  4 
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stock  is  exactly  equal  to  the  linear  term  for  shortfall  in  consumption  only  when  the  amount 
of  goods  is  unity. 

IfA  =  R  =  l,a  =  l  and  (3  =  2  we  have  twenty-eight  solutions,  including 

h=  ((8,5,4),  (4,8,9),  (4,4,8),  (4,4,4)).  (37) 

All  solutions  involve  large  transfers  of  goods.  In  this  scenario,  the  cost  of  failing  to  meet 
consumption  dominates  the  cost  of  overstocking  the  warehouses.  However,  even  here  no 
node  stocks  more  than  9  units  of  goods  at  any  stage. 


4  Comparison  of  Systems  under  Uncertainty 

and  Correlation 

4.1  Degradation  of  Optimality  under  Uncertainty 

In  this  section,  we  propose  that  the  consumption  schedule,  Cf,  i  €  JZ ,  t  €  is  uncertain. 
Instead,  we  have  an  a  priori  estimate  Cf  of  the  true  consumption  Cf.  In  the  last  section, 
this  estimate  coincided  with  the  realisation  so  that  Cf  =  Cf.  Here  we  use 

Cf  =  max{0,  Cf  +  ef}  (38) 

where  e\  is  an  error  term. 

In  this  study,  we  generate  the  error  term  e\  according  to  a  Bernoulli  sequence  as  follows. 
First,  e\  is  initialised  to  0.  Define  the  random  variable  Z  with  two  outcomes,  either 
success  with  fixed  probability  p  or  failure  with  probability  1  —p.  Then,  a  perturbation  is 
repeatedly  introduced  to  e\  by  repeatedly  adding  a  uniformly  distributed  random  integer 
over  -3  to  3  so  long  as  the  random  variable  Z  =  success.  Refer  to  Appendix  C  for  an 
illustrative  depiction  of  the  relationship  between  the  value  of  the  error  term  e\  =  e  and 
the  probability  that  it  is  observed  for  values  of  p  between  0.1  and  0.6. 

Under  uncertainty,  the  T  stage  DP  model  is  solved  in  multiple  instances.  Initially,  one 
complete  T  stage  optimal  policy  is  computed  using  the  inexact  approximations  Cf  to  Cf. 
The  true  values  of  Cf  are  then  revealed.  With  this  information,  an  optimal  policy  for  the 
remaining  T  —  1  stages  is  re-computed  using  an  appropriately  adjusted  Dynamic  Program 
and  the  true  values  of  Cf  are  revealed.  This  step  process  of  solution  and  re-computation 
is  repeated  until  all  values  Cf , ... ,  Cf  are  known. 

Six  logistics  chains  are  compared  in  Figure  12  each  of  length  4  with  consumption  Cf 
generated  on  the  integers  between  0  and  3.  In  each  case,  the  three  numbers  after  the  initial 
designator  DP  and  SR  describe  the  holdings  polices  of  nodes  1  through  3  respectively.  Each 
chain  is  independently  simulated  100  times  for  15  time  steps.  The  cost  or  penalty  of  each 
system  is  measured  according  to  equation  (5)  with  A  =  B  =  1  and  a  =  (3  =  2.  Figure  12 
illustrates  the  outcome  of  this  comparison  for  values  of  p  between  0  and  0.6  in  increments 
of  0.1.  The  error  bars  describe  99%  confidence  intervals. 


There  are  infinitely  many  possible  logistics  networks  and  ways  to  introduce  error  and 
uncertainty  into  those  systems.  Hence,  Figure  12  does  not  itself  tell  us  much  about  the 
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Figure  12:  Comparison  of  average  penalty  for  SR  and  DP  algorithms  over  the  level  of 
error  in  prediction 
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Figure  13:  Comparison  of  average  penalty  for  SR  algorithm  with  independent  and  corre¬ 
lated  consumption  generated  from  Uniform,  Normal  and  Exponential  distributions 
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interplay  between  the  two  approaches  across  all  possible  logistics  networks.  However, 
certain  enduring  features  do  arise.  The  DP  model  is  optimal  only  when  p  =  0.  The 
utility  of  this  approach  decreases  as  uncertainty  increases.  In  comparison,  the  SR  system 
of  Section  2.2  has  no  concept  of  estimating  future  consumption  and  is  in  that  sense  robust 
to  uncertainty.  For  all  logistics  systems  studied,  one  is  presented  with  the  problem  of 
determining  which  of  the  two  approaches  is  most  appropriate  given  the  level  of  uncertainty 
in  future  consumption.  In  general,  we  observe  that  the  SR  system  performs  best  when  the 
holdings  policies  are  set  relatively  large.  In  fact,  when  running  DP  with  policies  (7,  5,  3), 
there  is  little  visible  difference  in  curves  between  simulations  with  policies  (6,  4,  3)  while 
the  performance  of  the  SR  (7,  5,  3)  system  is  substantially  improved.  On  the  other  hand, 
DP  handles  situations  in  which  the  holdings  policies  are  relatively  low  substantially  better 
than  SR  dynamics.  While  both  approaches  benefit  from  increased  holdings  policies,  the 
marginal  utility  gained  by  increasing  holdings  policies  is  different.  In  both  approaches, 
where  the  relative  benefit  of  increasing  holdings  policies  are  negligible,  the  system  has 
reached  a  “saturation”  point  in  which  desired  consumption  is  always  or  almost  always 
satisfied  and  any  penalty  incurred  is  due  to  a  failure  to  resupply  the  system  to  the  desired 
holdings  levels. 


4.2  Degradation  of  Optimality  under  Correlation 

In  this  section,  the  uncertain  consumption  schedule  Cj,  i  £  ,/F,  t  £  3F  is  further  relaxed 
to  disregard  the  assumption  of  independence  of  consumption  at  different  nodes.  This  is 
motivated  by  the  observation  that  in  complex  systems,  correlations  between  nodes  in  a 
network  can  result  in  unexpected  behaviour.  One  source  of  correlation  may  be  that  the 
consumption  at  each  node  in  the  supply  chain  occurs  on  the  same  battlefield.  Then  during 
a  major  attack,  high  consumption  at  one  node  could  be  experienced  at  the  same  time  as 
high  consumption  at  other  nodes.  There  exists  both  historical  and  simulation  evidence 
that  power  laws  exist  in  casualty  statistics  of  warfare  (Richardson,  1941,  1960;  Roberts 
and  Turcotte,  1998;  Lauren,  2001),  which  could  indicate  correlation  in  the  activities  of 
combat  units.  Since  these  units  are  responsible  for  setting  the  consumption  schedule  it 
is  important  to  investigate  the  effect  of  correlations  on  the  performance  of  the  logistics 
supply  chain. 

We  compare  the  performance  of  the  DP  and  SR  algorithms  under  independent  and  cor¬ 
related  consumption  generated  by  Uniform,  Normal  and  Exponential  distributions.  The 
independent  consumption  schedules  with  uncertainty  are  calculated  as  per  Section  4.1, 
where  each  node  draws  a  random  number  from  a  distribution  and  adds  error  e\.  The 
three  cases  investigated  are  Xt  ~  {t/[0, 5],  JV(2.5,  2),  £'(2.5)},  which  have  equal  means. 
The  correlated  consumption  is  generated  by  a  single  distribution  for  the  entire  chain,  that 
when  divided  amongst  the  L  nodes  is  indistinguishable  from  the  independent  consumption 
schedules,  with  the  exception  that  correlation  between  any  two  nodes  is  almost  one1.  For 
L  =  3  this  gives  the  distributions  X  ~  {t/[0, 15],  IV(7.5,  \/l2),  E( 7.5)}.  Total  consumption 
when  xt  is  drawn  from  X  is  xt  +  £t,  and  consumption  at  node  i  equals 


1The  correlation  is  not  exactly  1  because  L  may  not  divide  a:4  €  X  exactly. 
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,  fL^J,  i*K, 

1  I  r^*+£*1  „■  c  'T7 


where  TZ  is  the  set  of  randomly  chosen  nodes  that  are  allocated  an  extra  unit  of  consump¬ 
tion  from  the  remainder  of  the  Integer  division. 

For  both  DP  and  SR,  for  each  distribution  the  (7,  5,  3)  holding  policy  performs  better  than 
(6,  4,  3)  and  (4,  4,  4)  for  all  values  of  et.  Therefore,  we  limit  the  following  comparisons  to 
the  (7,  5,  3)  holding  policy.  Figure  13  shows  the  average  penalty  for  the  SR  algorithm  with 
independent  and  correlated  consumption  distributions  over  1000  replications,  where  each 
supply  chain  is  simulated  for  1000  time  steps.  For  every  distribution,  the  penalty  increases 
nonlinearly  with  error,  such  that  the  rate  of  increase  in  penalty  grows  with  the  size  of  e.  All 
independent  and  identically  distributed  (denoted  IID  in  Figure  13)  consumption  schedules 
perform  significantly  better  than  correlated  consumption  schedules.  On  average,  the  SR 
algorithm  has  a  penalty  of  131  for  independent  consumption,  which  more  than  doubles 
to  288  with  correlated  consumption.  This  is  despite  the  fact  that  total  consumption  of 
the  independent  and  correlated  supply  chains  have  identical  distributions  -  it  is  only  the 
way  consumption  is  distributed  to  individual  nodes  that  varies.  The  complete  correlation 
modelled  here  represents  an  extreme  example  of  correlation,  and  serves  to  quantify  an 
upper  bound  on  the  effect  of  correlations  between  nodes  in  a  logistics  supply  chain. 


Figure  13  also  contains  insights  into  the  effect  of  different  types  of  distributions  on  supply 
chain  performance.  For  a  given  level  of  e,  for  both  correlated  and  independent  cases  the 
following  ordering  holds  within  the  99%  confidence  intervals:  the  Uniform  penalty  is  less 
than  or  equal  to  Normal  penalty,  which  is  strictly  less  than  the  Exponential  penalty.  This 
is  intuitive,  since  a  major  contributor  to  the  penalty  is  when  the  consumption  level  exceeds 
the  holding  level  at  a  node.  The  Uniform  distribution  is  bounded,  which  bounds  the  size 
of  consumption  at  any  time  step.  The  Normal  distribution  has  a  thin  tail  compared  to  the 
Exponential  distribution,  meaning  very  large  consumption  is  much  more  likely  with  an 
Exponential  distribution.  Interestingly,  for  independent  nodes,  the  performance  is  signifi¬ 
cantly  better  with  the  Uniform  distribution,  while  the  penalty  is  similar  between  the  Expo¬ 
nential  and  Normal  distributions.  However,  when  the  node  consumption  is  correlated,  the 
penalty  is  not  significantly  different  between  the  Uniform  and  Normal  distributions,  which 
are  both  considerably  better  than  the  Exponential  distribution.  Therefore,  the  penalty 
with  the  Normal  distribution  is  bounded  by  the  Uniform  and  Exponential  distributions, 
and  varies  within  this  range  as  a  function  of  the  degree  of  correlation. 

The  performance  of  the  DP  and  SR  with  Normal  and  Exponential  correlated  consumption 
schedules  is  compared  in  Figure  14.  The  results  for  the  Uniform  correlated  distribution  is 
not  shown  because  there  is  no  significant  difference  to  the  Normal  correlated  consumption 
performance  for  either  algorithm.  In  all  cases,  DP  performs  significantly  better  than  SR. 
The  performance  of  both  algorithms  is  significantly  worse  when  the  total  consumption 
across  the  supply  chain  is  generated  from  the  Exponential  distribution,  even  though  the 
mean  consumption  is  the  same  as  the  Normal  and  Uniform  consumption  schedules. 
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The  error  bars  for  the  SR  data  in  Figure  14  are  much  smaller  and  the  curves  are  smoother 
because  the  results  are  averaged  over  1000  time  steps  compared  to  only  15  time  steps 
for  DP.  When  the  SR  data  is  run  for  only  15  time  steps  the  penalty  function  is  slightly 
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Figure  14'-  Comparison  of  average  penalty  for  SR  and  DP  algorithms  when  node  consump¬ 
tion  is  correlated 


underestimated  (although  the  1000  step  results  are  still  well  within  the  confidence  intervals 
for  the  15  step  data),  since  the  initial  conditions  h ?  =  Vi  produces  upward  bias  in 
the  steady  state  estimate  of  the  expected  penalty.  DP  is  only  run  for  15  time  steps 
because  its  computational  complexity  prevented  solving  a  1000  step  problem  within  a 
reasonable  amount  of  time  with  the  available  computing  resources.  One  way  to  improve 
the  performance  of  DP  is  to  consider  a  finite  time  horizon  of  say  15  time  steps,  and  compute 
a  solution  to  the  1000  step  problem  by  successively  solving  the  problem  for  segments  of  15 
steps.  Another  option  is  to  consider  solving  the  DP  model  for  disjoint  subsets  of  nodes, 
where  each  subset  must  set  its  ordering  policy  independently  of  nodes  from  other  subsets. 
Both  approaches  have  reduced  time  complexity  and  can  be  considered  approximations  to 
the  full  DP  solution.  Neither  of  these  methods  are  implemented  in  this  study,  since  the 
15  time  step  data  provides  statistically  significant  results  for  comparisons  with  the  SR 
algorithm. 


5  Discussion 


Time-invariance  is  assumed  in  Section  2.3  for  tractability  and  ease  of  analysis.  This 
assumption  is  reasonable  in  a  subset  of  all  possible  systems  in  the  real  world,  those  being 
systems  whose  underlying  distributions  for  consumption  do  not  change  in  time.  Cyclic 
phenomenon  or  other  more  complex  system  dynamics  are  not  captured.  For  example, 
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consumption  in  a  logistics  system  varies  in  time  but  the  way  in  which  it  varies  may 
fundamentally  change  between  day  and  night.  A  day  and  night  scenario  can  be  broken 
into  two  independent  time-invariant  systems,  one  for  daytime  and  one  for  nighttime.  At 
the  other  extreme,  a  system  whose  distributions  vary  each  and  every  new  time  step  may  to 
all  practical  purposes  be  sufficiently  random  to  be  adequately  modelled  as  a  time-invariant 
system,  although  this  is  by  no  means  guaranteed.  Between  simple  cyclic  behaviour  and 
total  unpredictability  lies  system  dynamics  which  are  partially  predictable  and  therein 
display  trend-like  behaviour  yet  are  sufficiently  complex  to  defy  routine  analysis.  In  such 
systems,  adaptive  learning,  or  a  similar  technique,  is  required  to  teach  the  system  how  to 
detect,  anticipate  and  react  to  changes  within  the  system. 

The  formulation  for  the  logistics  network  of  Section  2.1  includes  a  maximum  link  capacity 
A*  ■  between  nodes  i  and  j,  and  a  maximum  warehouse  capacity  W%,  for  nodes  k  and 
time  t.  These  capacity  constraints  are  used  in  this  report  indirectly  and  are  somewhat 
peripheral  to  our  results.  Future  applications  of  this  research  include  a  direct  study  of 
the  effects  of  link  and  warehouse  capacities  by  including  additional  cost  terms  to  the 
penalty  function  (5).  This  future  work  complements  our  study  with  a  focus  on  logistical 
network  design  rather  than  policy  setting  and  optimal  routing  in  existing  systems.  Further 
extensions  include  varying  the  stock  multiplier  Q,  which  describes  the  container  or  parcel 
size.  The  benefits  and  costs  between  various  methods  of  resupply,  which  deliver  stock  in 
different  base  quantities,  can  then  be  gauged  against  the  number  of  packages  required  and 
the  frequency  of  delivery  required. 

This  study  identified  interesting  nonlinear  dynamics  that  could  be  explored  in  further 
detail.  In  particular,  the  phase  transitions  observed  in  the  Lagrangian  relaxation  were 
peripheral  to  the  scope  of  this  report  but  suggest  the  potential  for  further  research  into 
the  general  existence  and  cause  of  phase  transitions  in  simple  linear  supply  chains.  Due  to 
time  limitations,  only  a  20  node  supply  chain  was  tested  for  correlation  and  downstream 
effects,  but  larger  empirical  studies  using  the  methods  developed  in  Section  2.4  may  lead 
to  more  conclusive  results.  The  SR  model  may  show  improved  performance  if  adapta¬ 
tion  is  incorporated  using  a  machine  learning  technique  such  as  Reinforcement  Learning. 
An  adaptive  model  would  also  be  a  more  accurate  representation  of  the  SR  logistics  con¬ 
cept.  An  obvious  extension  is  to  consider  logistics  networks  rather  than  only  linear  supply 
chains.  It  is  not  obvious  whether  a  network  is  more  or  less  vulnerable  to  uncertainty  and 
correlation,  or  what  network  topology  provides  the  best  performance  in  different  contexts. 
This  study  would  provide  a  useful  baseline  for  any  research  into  more  general  network 
models  of  SR  logistics. 

A  limitation  to  the  use  of  DP  is  the  underlying  assumption  of  perfect  foresight.  Put  simply, 
the  future  consumption  is  known  or  estimated  a  priori.  This  assumption  is  reasonable  for 
capturing  predictable  behaviour.  For  example,  over  the  duration  of  a  day,  a  force  in 
combat  may  expend  a  well  known  quantity  of  munitions.  It  may  be  reasonable  in  such 
a  situation  to  assert  that  the  expenditure  of  munitions  today  will  be  similar  to  that  of 
yesterday  or  perhaps  a  weighted  average  of  the  expenditure  over  the  last  week.  This  is 
called  naive  and  adaptive  expectations  respectively  (Lucas,  1986).  Perfect  foresight  is  also 
reasonable  in  situations  of  routine  delivery,  such  as  supplying  provisions  for  infantry,  and 
any  kind  of  planned  activity,  such  as  a  planned  offensive  attack,  where  one  can  estimate 
the  requirement  for  medical  supplies  with  reasonable  certainty.  In  any  case,  in  conducting 
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a  planned  offensive  attack,  ordinance,  medical  supplies,  provisions  and  other  goods  are 
ordered  and  stored  in  advance  and  not  upon  action  due  to  the  non-zero  delay  in  logistical 
support.  Finally,  DP  is  reasonable  to  use  when  rationing  goods.  In  this  situation,  the 
true  real  demand  for  goods  exceeds  the  actual  preferred  consumption  level  set  in  the  DP 
model.  Consumption  across  the  network  is  deliberately  capped. 

A  three  faceted  approach  to  the  study  of  logistics  supply  chains  is  recommended  for  a 
balanced  program  of  work  across  a  broader  context.  This  report  delivers  one  of  these 
approaches  by  exploring  the  use  of  the  techniques  from  statistics,  control  theory  and 
optimisation.  The  remaining  two  recommended  studies  employ  classical  network  theory 
and  queuing  theory  respectively.  In  the  first  instance,  transportation  problems,  multi¬ 
commodity  flow  problems  and  scheduling  problems  have  the  potential  to  generate  valuable 
insights  when  appropriately  modelled  in  a  logistics  context.  In  the  second  instance,  metrics 
such  as  average  occupancies,  mean  waiting  times  and  stationary  distributions  are  useful  in 
measuring  performance  in  logistics  systems.  The  union  of  these  three  studies  then  comprise 
a  well-rounded  and  balanced  research  program  designed  to  deliver  coherent,  relevant  and 
useful  recommendations  to  the  logistics  community. 


6  Conclusions 

This  study  has  presented  a  foundation  for  the  analysis  of  logistics  supply  chains  without 
back-orders.  Two  models  were  developed.  First,  a  purely  reactive  demand-oriented  model 
was  formulated  without  any  prediction  or  estimation  of  future  demand.  This  model  was 
used  to  determine  policies  for  holdings  levels  throughout  the  chain  which  minimise  the 
probability  of  stock-outs  and  shortfalls  in  supply.  Second,  a  deliberately  planned  supply- 
oriented  model  was  formulated.  This  model  was  used  to  determine  the  optimal  supply 
regime  which  minimises  the  cost  associated  with  over  and  under  supply.  Finally,  a  com¬ 
parison  between  the  two  methods  was  conducted  where  demand  is  uncertain  and  cannot 
be  accurately  predicted. 

It  is  concluded  that  both  the  demand  oriented  and  supply  oriented  models  are  useful 
in  context  of  the  predictability  and  uncertainty  of  future  demand  in  the  system.  While 
the  demand  oriented  model  exhibits  far  from  optimal  performance,  its  utility  or  benefit 
does  not  depend  on  being  able  to  predict  future  demand.  Conversely,  the  optimal  supply 
regime  in  the  supply  oriented  model  explicitly  requires  an  accurate  forecast  of  future  de¬ 
mand  throughout  the  system.  The  utility  or  benefit  obtained  in  using  demand-oriented 
mechanics  substantially  decreases  when  the  holdings  policies  in  the  system  limit  through¬ 
put.  However,  while  supply  oriented  techniques  are  superior  in  efficiency  and  effectiveness 
in  these  cases,  there  is  a  point  at  which  uncertainty  in  the  system  becomes  excessive.  For 
any  real  logistics  system,  identifying  the  boundary  between  the  techniques  in  terms  of 
uncertainty  and  sufficient  throughput  is  necessary  to  determine  which  of  the  two  options 
is  most  appropriate  and  relevant. 

Correlations  between  demand  at  different  nodes  in  a  supply  chain  can  significantly  reduce 
the  performance  of  the  system,  which  highlights  the  need  to  incorporate  a  level  of  buffering 
in  warehousing  policies  to  counter  this  effect  when  correlations  are  possible.  The  statistical 
nature  of  demand  is  also  an  important  factor  in  the  performance  of  the  system,  and 
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additional  buffering  is  recommended  for  logistics  supply  chains  when  statistical  inference 
from  historical  data  indicates  demand  is  fat-tailed. 

The  results  of  this  study  do  not  directly  contradict  the  ideas  presented  in  the  Australian 
Defence  Force’s  Future  Joint  Logistics  Concept  (Commonwealth  of  Australia,  2002)  but 
instead  serve  as  a  warning  that  responsive  systems,  simple  or  complex,  will  not  and  can¬ 
not  entirely  replace  traditional  planned  systems  and  that  the  problems  associated  with  the 
design,  implementation  and  analysis  of  adaptive  precision  systems  are  non-trivial.  This 
study  tentatively  supports  the  conclusion  that  enhanced  capabilities,  emerging  technolo¬ 
gies  and  network-enabled  warfare  will  ultimately  facilitate  better  management  and  control 
of  logistics  systems.  However,  it  does  so  on  the  basis  that  these  new  developments  will 
reduce  uncertainty  across  the  system  and  therein  have  an  indirect  effect  on  the  system 
rather  than  directly  supporting  better  logistics  supply.  Further  work  at  the  Defence  Sci¬ 
ence  and  Technology  Organisation  is  currently  underway  to  explore  these  issues  in  greater 
detail 
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Appendix  A  Geometry  of  the  State  Space 

The  success  of  any  global  optimisation  heuristic  depends  on  the  geometry  of  the  fitness 
landscape  and  the  underlying  state  space.  Wolpert  and  Macready  (1997)  have  shown  that 
over  all  fitness  landscapes,  no  heuristic  is  better  than  any  other.  Therefore,  if  the  SA 
algorithm  we  developed  in  Section  2.3  is  to  be  effective,  it  must  exploit  some  systemic 
feature  of  the  state  space.  Most  optimisation  techniques  are  sensitive  to  the  number  of 
local  minima  on  the  fitness  landscape.  For  very  rugged  landscapes,  the  SA  algorithm  would 
require  a  very  slow  cooling  of  the  temperature  parameter  to  avoid  becoming  trapped  in 
one  of  many  suboptimal  local  minima  that  may  exist  at  a  large  distance  (measured  by 
the  minimum  number  of  operations  of  the  SA  neighbourhood  function)  from  the  global 
minimum.  In  this  section,  we  develop  a  distance  measure  and  use  Principal  Components 
Analysis  to  better  understand  the  geometry  of  the  state  space. 

Consider  a  linear  supply  chain  of  length  five  (including  the  source  node  0)  where  each 
node  has  consumption  generated  according  to  a  Normal  distribution  N( 5,  3).  Using  brute 
force  enumeration,  we  explore  a  region  of  integer  valued  solutions  to  the  holding  policy 
for  each  node.  That  is,  for  every  possible  combination  of  node  holding  policies  within  the 
region,  the  chain  is  simulated  for  1000  time  steps.  The  region  is  defined  by  setting 


j>i 

and 

j>i 


(Al) 

(A2) 


where  fij  is  the  mean  of  the  distribution  at  node  j.  Thus  the  {H1f'w ,  H-I,!jh)  tuples  for 
nodes  1  <  i  <  4  are  (20,25),  (15,20),  (10, 15),  (5, 10).  The  minimisation  problem  defined 
in  (12)  with  the  second  objective  function  22  P(C'(  —  c\  <  y)  is  used,  with  y  =  —  1  and 
Hmax  =  60.  Here,  60  is  chosen  to  provide  the  maximum  number  of  146  valid  combinations 
where  22  Hi  =  Hmax  within  the  region  defined  by  (Al)  and  (A2),  although  the  following 
results  are  found  to  hold  for  other  values  of  Hmax ■  The  interpretation  of  setting  y  =  —  1 
is  that  desired  consumption  exceeds  actual  consumption,  which  means  the  supply  chain 
is  unable  to  meet  demand.  Henceforth,  we  refer  to  the  objective  using  the  abbreviated 
notation  22  P- 


The  obtained  optimal  solution  for  a  simulation  of  1000  time  steps  is  H*  =  (21, 17, 12, 10) 
which  has  IP  =  0.173.  This  is  a  boundary  solution,  since  node  four  is  at  its  maximum 
value.  The  objective  function  JS?^)  is  graphed  for  each  of  the  146  combinations  where 
22  Hi  =  Hmax  in  Figure  A.l.  As  can  be  seen  from  Figure  A.l,  the  optimal  solution  is  the 
27th  combination.  The  combinations  are  ordered  by  enumerating  between  H\ow  and  the 
Hfgh  for  each  i,  incremented  in  reverse  node  order  and  satisfying  Hmax  =  60.  The  first 
combination  in  Figure  A.l  is  (20,15,15,10)  and  the  last  is  (25,20,10,5). 


Although  Figure  A.l  has  many  local  minima,  this  is  mainly  an  artifact  of  ordering  four 
dimensional  data  on  one  dimension  of  the  graph.  To  understand  the  space  better  we 
measure  the  distance  in  state  space  between  the  best  configurations,  where  the  state  is 
completely  defined  by  the  L-tuple  H .  The  distance  in  state  space  is  defined  as  the  minimum 
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number  of  units  of  stock  that  must  be  switched  from  one  warehouse  to  another  in  order 
to  move  between  two  stocking  policies.  Formally,  for  C  :  x  — >  M, 

C{G,H)=l~YJ\Gi~Hi\  (A3) 

i 

where  Gi,H,L  represent  the  holding  policy  at  node  i  under  policies  G  and  H  respectively. 
Table  A.l  lists  the  10  best  configurations,  each  with  value  less  than  0.03  worse  than  the 
optimal  solution,  along  with  the  five  worst  holding  policies.  The  10  best  holding  policies 
equate  to  points  on  the  eight  local  minima  that  are  below  the  0.2  value  in  Figure  A.l. 
Four  of  the  top  ten  solutions  are  within  distance  one  of  the  optimal  policy  H*,  which  has 
nine  holding  policies  within  distance  one  in  total.  Further,  the  remaining  six  solutions  in 
the  top  ten  are  all  only  two  units  from  H* .  A  result  that  is  not  shown  in  Table  A.l  is  that 
the  first  policy  to  have  a  distance  of  three  or  greater  from  H*  is  the  18th  best  solution 
overall. 

If  we  consider  the  five  worst  policies  in  Table  A.l,  we  observe  that  two  are  seven  units 
distance  away  from  the  optimal  policy.  Seven  is  the  maximum  distance  from  the  optimal 
policy,  and  these  are  the  only  two  policies  that  exist  with  C(H,  H*)  =  7. 

Table  A.l:  Ten  best  and  five  worst  holding  policies  H  ranked  according  to  compared 
to  their  distance  to  the  optimal  solution  H* 


H 

£(H,  H*) 

EP 

(21,  17,  12,  10) 

0 

0.173 

(22,16,12,  10) 

1 

0.186 

(21,  18,  13,  8) 

2 

0.186 

(21,  17,  13,  9) 

1 

0.187 

(20,  18,  12,  10) 

1 

0.19 

(22,  17,  13,  8) 

2 

0.191 

(20,  17,  14,  9) 

2 

0.193 

(21,  19,  11,  9) 

2 

0.193 

(22,  18,  12,  8) 

2 

0.196 

(22,  17,  12,  9) 

1 

0.199 

(22,  16,  13,  9) 

2 

0.203 

(25,  19,  11,  5) 

6 

0.455 

(25,  15,  15,  5) 

7 

0.457 

(24,  20,  11,  5) 

6 

0.459 

(24,  18,  13,  5) 

5 

0.463 

(25,  20,  10,  5) 

7 

0.542 

If  the  value  of  node  1  is  held  constant,  the  remaining  three  dimensions  for  nodes  2  through 
4  can  be  visualised.  Figure  A. 2  shows  the  ^  P  as  the  area  of  the  bubbles  for  each  holding 
policy  when  node  1  equals  22.  We  can  see  that  the  landscape  is  quite  smooth,  since  the 
bubbles  vary  in  size  following  a  regular  pattern. 
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Combination  Number 


Figure  A.l:  P  for  each  brute  force  enumeration  for  a  five  node  supply  chain 


Figure  A. 2:  Bubble  scatter  plot  ofJfP  for  a  five  node  supply  chain  when  node  1  =  22 
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In  order  to  visualise  the  landscape  for  more  than  three  nodes  it  is  necessary  to  perform  a 
dimensionality  reduction.  Principal  Components  Analysis  (PCA)  (Jolliffe,  1986)  is  a  factor 
analysis  method  that  is  used  to  perform  a  variance  maximising  rotation  of  the  original 
state  space.  By  selecting  the  p  dimensions  under  the  rotation  with  the  largest  variance, 
the  data  is  projected  onto  p  orthogonal  factors  with  the  minimum  error  to  the  original 
data  under  the  £2-norm.  The  prinicpal  components  are  given  by  the  first  p  eigenvectors 
of  the  correlation  matrix,  ordered  by  the  magnitude  of  the  associated  eigenvalues. 

The  state  space  includes  the  L  node  holding  policies  in  the  natural  numbers,  not  including 
the  source  node  0,  plus  an  additional  non-negative  real  dimension  for  E^1-  First,  we 
consider  the  brute  force  results  for  the  region  defined  by  (Al)  and  (A2)  regardless  of 
whether  Hmax  =  60  is  satisfied.  This  consists  of  five  columns  of  data  Xi,  1  <  i  <  5,  one 
for  each  dimension  of  the  state  space,  and  1296  rows,  one  for  each  brute  force  combination 
of  node  holding  policies.  The  correlation  between  two  columns  of  data  is  given  by 


cor  (xi,Xj) 


CO v(xi,  Xj) 

(J  j 


where  cr*  is  the  standard  deviation  of  xt .  The  covariance  is  given  by 


(A4) 


CO v(xi,Xj)  =  E((xi  -  pi) {xj  -  Pj)),  (A5) 

where  E  is  the  mathematical  expectation  and  pt  is  E(xf).  The  correlation  matrix  between 
each  column  is  shown  in  Table  A. 2.  Because  the  state  space  is  completely  enumerated 
over  the  1296  observations,  there  exists  no  correlation  between  the  nodes.  We  observe  on 
average,  as  each  node  increases  its  holding  policy,  that  E  P  decreases,  and  the  magnitude 
of  the  correlation  increases  for  nodes  further  down  the  supply  chain.  Therefore,  if  all  nodes 
have  the  minimum  stock  levels  H  =  (20, 15, 10,  5)  we  would  expect  the  marginal  utility  of 
adding  extra  holding  to  be  greatest  at  node  4  and  smallest  at  node  1. 

Table  A. 2:  Correlation  matrix  between  nodes  and  E^  from  brute  force  enumeration  of  a 
five  node  supply  chain 


Node  1 

Node  2 

Node  3 

Node  4 

Ep 

Node  1 

1 

Node  2 

0 

1 

Node  3 

0 

0 

1 

Node  4 

0 

0 

0 

1 

£P 

-0.2055 

-0.3104 

-0.4193 

-0.7688 

i 

The  eigenvectors  and  eigenvalues  are  shown  in  Table  A. 3.  The  percentage  of  variation  in 
the  direction  of  each  eigenvector  is  calculated  from  the  relative  size  of  the  corresponding 
eigenvalue.  The  first  eigenvector,  which  is  a  combination  of  all  four  node  holding  policies 
plus  50%  of  the  E  P  dimension,  captures  39%  of  all  variation.  The  next  three  dimensions 
are  independent  of  ^  P  and  account  for  a  further  60%  of  the  variation.  The  final  eigen¬ 
vector  is  a  reflection  on  a  hyperplane  of  the  first  eigenvector,  which  can  be  thought  of  as 
the  error  of  the  first  eigenvector  in  accounting  for  the  variation  in  ^  P  as  a  function  of 
the  node  holding  policies.  The  final  eigenvector  only  contributes  1%  to  the  total  varia¬ 
tion.  Therefore,  the  underlying  data  is  essentially  four  dimensional,  but  the  redundant 
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Table  A .  3:  Percentage  of  variation  for  principal  components  from  brute  force  enumeration 
of  a  five  node  supply  chain 


Eigenvectors 

Eigenvalues 

Percentage  of  Variation 

(-0.1527,  -0.2307,  -0.3116,  -0.5713,  0.7071) 

1.9516 

39.0% 

(0.9762,  -0.0559,  -0.0922,  -0.188,  0) 

1 

20.0% 

(0.0173,  -0.9422,  0.109,  0.3163,  0) 

1 

20.0% 

(-0.0079,  -0.0521,  0.8862,  -0.4602,  0) 

1 

20.0% 

(0.1527,  0.2307,  0.3116,  0.5713,  0.7071) 

0.0484 

1.0% 

dimension  is  a  combination  of  all  five  of  our  original  dimensions,  so  none  of  the  original 
dimensions  can  be  completely  eliminated. 

The  first  three  principal  components  graphed  in  Figure  A. 3  account  for  80%  of  the  total 
variation.  This  space  is  well  behaved  in  the  sense  that  changes  in  the  second  and  third 
principal  components,  which  are  independent  of  Y2  P,  result  in  smooth  changes  to  the 
first  principal  component  that  captures  the  majority  of  the  variation  in  ^2  IP-  The  other 
relevant  feature  of  Figure  A. 3  is  the  way  in  which  the  data  is  clustered  into  six  layers 
according  to  their  second  and  third  principal  components. 


Figure  A .  3:  First  three  principal  components  from  brute  force  enumeration  of  a  five  node 
supply  chain 

We  now  consider  the  effect  of  the  additional  constraint  ^2  Hi  =  60  on  the  principal  com¬ 
ponents.  Because  P (C-  —  c\  <  y)  is  a  monotonic  decreasing  function  of  Fbi  Vi,  we  consider 
only  the  Pareto  dominant  subset  of  ^  Ht  <  Hmax  given  by  ^  Hi  =  Hmax,  which  contains 
146  different  holding  policy  combinations.  This  introduces  a  relation  between  the  values 
of  the  nodes  which  lie  on  the  Simplex  defined  by  ^  Hi  =  60.  Specifically,  if  we  know  Hj 
for  all  nodes  except  one  node  j  we  can  deduce  that  Hj  =  60  —  ^k-  The  correlation 

matrix  for  the  brute  force  enumeration  given  ^  HL  =  60  is  given  in  Table  A. 4.  Compared 
to  the  correlation  matrix  in  Table  A. 2,  it  can  be  observed  that  Table  A. 4  contains  corre- 
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lations  between  nodes  due  to  the  additional  constraint.  The  only  node  to  have  a  negative 
correlation  with  is  node  4.  The  reason  the  other  nodes  are  positive  is  because  any 
increase  in  say  node  1  will  result  in  a  reduction  in  the  holding  policy  for  some  node  further 
down  the  chain  and  so  ^  P  can  increase  even  when  P (Cf  —  c\  <  y)  decreases.  The  first 
three  nodes  have  decreasing  positive  correlations  with  ^  P  as  the  node  number  increases, 
so  the  sign  but  not  the  ordering  of  the  marginal  utility  has  changed  compared  to  Table 
A. 2. 

Table  A.f :  Correlation  matrix  when  ^  Hi  =  60 


Node  1 

Node  2 

Node  3 

Node  4 

£P 

Node  1 

1 

Node  2 

-0.3333 

1 

Node  3 

-0.3333 

-0.3333 

1 

Node  4 

-0.3333 

-0.3333 

-0.3333 

1 

£P 

0.502 

0.2459 

0.0144 

-0.7623 

1 

The  largest  eigenvalue  in  Table  A. 5  shows  that  42.5%  of  all  variation  can  be  captured 
by  a  combination  of  three  nodes  plus  the  ^  P.  Although  all  four  nodes  have  non-zero 
values,  node  three’s  value  of  0.0117  is  not  significant.  With  three  dimensions,  95.9%  of 
the  variation  is  be  represented,  while  four  dimensions  accounts  for  all  of  the  data.  This 
means  the  fifth  dimension,  which  is  an  equal  combination  of  all  four  node  holding  policies 
is  redundant. 

Table  A .  5:  Percentage  of  variation  for  principal  components  between  nodes  and  fC  P  from 
brute  force  enumeration  of  a  five  node  supply  chain  given  Hi  =  60 


Eigenvectors 

Eigenvalues 

Percentage  of  Variation 

(0.4068,  0.1992,  0.0117,  -0.6177,  0.6428) 

2.1266 

42.5% 

(0.1012,  0.5099,  -0.8266,  0.2155,  0) 

1.3333 

26.7% 

(0.6766,  -0.6499,  -0.2578,  0.2311,  0) 

1.3333 

26.7% 

(-0.3413,  -0.1672,  -0.0098,  0.5183,  0.766) 

0.2067 

4.1% 

(0.5,  0.5,  0.5,  0.5,  0) 

0.0000 

0.0% 

The  first  three  principal  components  for  the  brute  force  enumeration  given  Hi  =  60 
are  graphed  in  Figure  A. 4.  The  Simplex  relation  between  the  nodes  has  reduced  the  six 
distinct  layers  of  clustered  points  to  a  single  plane.  The  first  principal  component  now 
captures  all  variation  due  to  ^  P,  and  changes  to  the  second  or  third  principal  component 
produces  smooth  changes  to  the  first  factor.  Further,  the  minimum  value  along  the  first 
principal  component  occurs  at  a  corner  of  the  plane,  and  other  low  values  are  nearby  in 
this  space,  while  large  values  in  the  first  factor  are  distant.  This  is  the  most  accurate 
visualisation  of  the  state  space  that  is  possible  in  three  dimensions,  and  confirms  the 
intuition  we  developed  earlier  when  analysing  the  distances  in  Table  A.l.  Therefore,  we 
conclude  that  for  this  particular  system  under  the  distance  measure  (A3)  the  geometry  is 
suitably  well  behaved  to  expect  good  performance  using  SA  with  a  relatively  rapid  cooling 
schedule. 
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Figure  A. 4-'  First  three  principal  components  from  brute  force  enumeration  of  a  five  node 
supply  chain  given  ^  H%  =  60. 
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Appendix  B  Simulated  Annealing 

SA  is  a  heuristic  technique  loosely  based  on  the  physics  of  thermal  equilibria  and  the  prop¬ 
erties  of  materials  like  metal  and  glass.  In  such  materials,  adding  heat  disrupts  the  state  of 
the  material  and  through  cooling  the  material  is  annealed  to  alter  its  properties.  Typical 
examples  of  this  process  include  glass  blowing  and  the  tempering  of  steel.  The  SA  algo¬ 
rithm  is  conceptually  modelled  on  this  process  and  is  applied  as  an  heuristic  optimisation 
technique  for  problems  which  are  otherwise  intractable  or  difficult  to  solve.  The  algorithm 
is  traditionally  posed  to  minimise  energy  levels  in  a  system  and  therein  stabilise  it.  The 
algorithm  successively  introduces  one  or  more  perturbations,  disturbances  or  transitions 
into  the  system  in  an  attempt  to  restabilise  it  in  more  desirable  states.  If  the  perturbation 
produces  a  favourable  result,  then  the  new  system  state  is  accepted.  Otherwise,  the  new 
state  is  accepted  on  the  basis  of  some  pre-determined  stochastic  acceptance  function  which 
classically  resembles  the  Boltzmann  distribution. 

Denote  the  space  of  all  possible  states  5?  over  which  the  SA  algorithm  searches.  The  algo¬ 
rithm  is  initialised  by  identifying  the  initial  state  so  £  S  of  the  system  being  annealed  and 
the  cooling  schedule  of  temperatures  (Tj),  whose  elements  Tj  £  T  C  M+  are  monotonically 
decreasing,  where  i  =  j  . . .  J  corresponds  to  an  iteration  counter  of  the  algorithm  for  some 
fixed  number  of  annealing  steps  J.  The  probability  of  accepting  a  transition  depends  on 
the  objective  function,  often  referred  to  as  the  energy  function  Et  :  S2  — »  R,  and  is  defined 

f  ~(gT(g)~gr(s))  1 

P(accept  new  state  s|  current  state  s)  =  min  <  1,  e  * t  t  (Bl) 

for  some  scaling  constant  k,  where  T  £  T  is  the  constant  denoting  the  current  temperature 
level  of  the  algorithm.  The  probability  of  accepting  transitions  leading  to  less  desirable 
states  potentially  allows  the  algorithm  to  escape  local  minima. 

The  algorithm  iteratively  conducts  a  biased  random  walk  over  the  state  space  S?  at  each 
temperature  Tj  £  T,  initially  starting  in  so  £  <S,  as  follows.  Set  the  current  state  of  the 
system  s  =  sq.  For  each  temperature  Tj  in  order:  generate  a  new  state  s;  accept  or  reject 
the  state  s  as  the  current  state  s  according  to  (Bl);  do  this  for  K  time  steps.  This  loop 
of  K  steps  is  referred  to  as  a  Metropolis  loop  after  Metropolis  et  al.  (1953).  The  number 
of  steps  in  the  Metropolis  loop  and  the  cooling  schedule  are  tuning  parameters  that  are 
set  based  on  constraints  on  available  processor  time  and  an  evaluation  of  how  well  the 
problem  is  solved  (usually  this  means  satisfying  some  criteria  of  utility  combined  with 
evidence  that  the  SA  is  able  to  converge  when  the  best  solution  does  not  change  over  a 
large  number  of  steps).  We  expect  SA  will  find  a  near  optimal  solution  even  with  a  rapid 
cooling  schedule  due  to  the  nature  of  the  solution  space  (see  Appendix  A). 
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Appendix  C  Error  Component 

To  illustrate  the  value  of  the  error  component  e  in  Section  4.1,  we  generated  50,000  inde¬ 
pendent  error  terms  and  counted  the  frequency  of  each  outcome,  see  Figure  C.l.  In  this 
figure,  values  outside  of  [—6,  6]  are  not  displayed  as  their  frequency  is  too  low. 
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Figure  C.l:  Frequency  of  outcome  against  observed  error  in  50,000  independent  trials 
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